Re: [R-sig-eco] Clustering large data

2008-10-07 Thread Peter Solymos
Dear Thierry,

the 'mefa' package should do this, and I am also interested in the
testing of the package for such a large number of species. I have used
it before with 75K records, but only with ~160 species and 1052 sites.
So please let me know if it worked!

You can do the clustering like this (SAMPLES and SPECIES are the two
column in the long format, have to be the same length):

x <- mefa(stcs(data.frame(SAMPLES,SPECIES)))
cl <- hclust(dist(x$xtab))

Hope this works,

Peter

Peter Solymos, PhD
Department of Mathematical and Statistical Sciences
University of Alberta
Edmonton, Alberta, T6G 2G1
CANADA



On Tue, Oct 7, 2008 at 4:12 AM, ONKELINX, Thierry
<[EMAIL PROTECTED]> wrote:
> Dear all,
>
> We have a problem with a large dataset that we want to cluster. The
> dataset is in a long format: 1154024 rows with presence data. Each row
> has the name of the species and the location. We have 1381 species and
> 6354 locations.
> The main problem is that we need the data in wide format (one row for
> each location, one column for each species) for the clustering
> algorithms. But the 6354 x 1381 dataframe is too big to fit into the
> memory. At least when we use cast from the reshape package to convert
> the dataframe from a long to a wide format.
>
> Are there any clustering tools available that can work with the data in
> a long format or with sparse matrices (only 13% of the matrix is
> non-zero)? If the work with sparse matrices: how to convert our dataset
> to a sparse matrix? Other suggestions are welcome.
>
> We are working with R 2.7.2 on WinXP with 2 GB RAM. --max-mem-size is
> set to 2047M.
>
> Thanks,
>
> Thierry
>
>
> 
> 
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature
> and Forest
> Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
> methodology and quality assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
> tel. + 32 54/436 185
> [EMAIL PROTECTED]
> www.inbo.be
>
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to
> say what the experiment died of.
> ~ Sir Ronald Aylmer Fisher
>
> The plural of anecdote is not data.
> ~ Roger Brinner
>
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of
> data.
> ~ John Tukey
>
> Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer
> en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd 
> is
> door een geldig ondertekend document. The views expressed in  this message
> and any annex are purely those of the writer and may not be regarded as 
> stating
> an official position of INBO, as long as the message is not confirmed by a 
> duly
> signed document.
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Question on height for hclust function

2008-11-17 Thread Peter Solymos
Dear Leigh,

The Ward method is minimizing the within cluster sum of squares of the
distances. So it is not easy to back-scale it to reflect original
distances. Instead you should try *linkage methods, see ?hclust. To
read about the Ward (Ward-Orloci) method see:
- Ward 1963 JASA 58: 236-244
- Orloci 1967 J Ecol 55: 193-206

Yours,

Peter

On Mon, Nov 17, 2008 at 12:40 PM, Leigh Fall <[EMAIL PROTECTED]> wrote:
> I've run a cluster analysis with Jaccard distance and Ward's method.  The
> clustering height (located on the left of the dendrogram) is not scaled to
> the distance function because the height values range from 0 to 3.5 in
> increments of 0.5.  In Oksanen's Vegan tutorial, the examples of the cluster
> dendrograms show height values that appear to be the Bray distance values
> (with various linkage methods) because the height values are less than 1.
> I'm not sure what my height values reflect.  Is the clustering height
> associated with Wards?  Can the the height values be rescaled to the Jaccard
> values?
>
> Many thanks,
> Leigh
>
>
> --
> 
> Leigh M. Fall
> Ph.D. Candidate
> Dept. of Geology and Geophysics
> Texas A&M University
> 3115 TAMU
> College Station, TX  77843
> Phone: (979) 845-3071
> E-mail: [EMAIL PROTECTED]
>
> 
>
>[[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Subset by family name?

2008-11-29 Thread Peter Solymos
Hi All,

maybe a more transparent solution for the zombie factor problem
(dropping unused factor levels) for data frames is (note, this applies
for all factors in the data frame x):

x[] <- lapply(x, function(x) x[drop = TRUE])

As I recall, on the help page of factor(), there is a slight warning
against character or numeric coercion of factors.

Cheers,

Peter


On Sat, Nov 29, 2008 at 9:25 AM,  <[EMAIL PROTECTED]> wrote:
> This can still be a problem after subsetting with zombie factors hanging
> around.  It's particularly annoying when your boxplotting from a subset,
> as you'll have a bunch of empty entries in the plot.  I have a function I
> call purgef that deals with eliminating levels of a factor that I have
> subsetted out.
>
> purgef<-function(x){
>  x<-as.character(x)
>  x<-as.factor(x)
>  return(x)
> }
>
> Gets rid of those pesky zombie levels.
>
> In your case
>
> Yut_are$Mark<-purgef(Yut_are$Mark)
>
>> Sorry to bother everyone---I realized I should have used "==" instead
>> of "=" in the subset syntax!
>>
>>
>> Quoting Ophelia Wang <[EMAIL PROTECTED]>:
>>
>>> Hi all,
>>>
>>> I thought this should be very simple, but I'm not sure where the
>>> problem is. I have a .txt data file that contains X and Y coordinates
>>> of trees and their family names:
>>>
>>> "X"  "Y" "Mark"
>>> 028  "Sapotaceae"
>>> 130  "Meliaceae"
>>> 140  "Meliaceae"
>>> 160  "Mimosaceae"
>>> 176  "Olacaceae"
>>> 1.5  73  "Myristicaceae"
>>> 234  "Euphorbiaceae"
>>> 262  "Olacaceae"
>>> 286  "Mimosaceae"
>>> 2.5  36  "Arecaceae"
>>> 322  "Nyctaginaceae"
>>> 325  "Moraceae"
>>> 338  "Rubiaceae"
>>> 347  "Desconocido "
>>> 399  "Mimosaceae"
>>> 3.5  24  "Anacardiaceae"
>>> 3.5  57  "Sapotaceae"
>>> 41   "Lecythidaceae"
>>>
>>> Now I just want to work on one family for various spatial analyses in
>>> ads and spatstats, so I wrote:
>>>
>>> Yut <-read.delim(
>>> "C:/dissertation/data2006/Parcela_1-3/Yutsun_tree.txt", header = TRUE,
>>> sep = "\t", quote="\"", dec=".", fill = TRUE )
>>>
>>> Yut_are <- subset (Yut, Mark="Arecaceae", select=c(X, Y, Mark))
>>>
>>> However, the summary of Yut_are still contains trees of other families:
>>>
>>>   XYMark
>>>  Min.   :  0.00   Min.   : 0.00   Myristicaceae: 65
>>>  1st Qu.: 24.00   1st Qu.:24.00   Lecythidaceae: 60
>>>  Median : 46.00   Median :51.00   Sapotaceae   : 51
>>>  Mean   : 48.07   Mean   :49.72   Moraceae : 45
>>>  3rd Qu.: 72.50   3rd Qu.:75.50   Arecaceae: 41
>>>  Max.   :100.00   Max.   :99.00   Mimosaceae   : 34
>>>   (Other)  :313
>>>
>>> Please tell me how do I subset a dataset like this to extract trees
>>> from only one or a few families? Thanks a lot!
>>>
>>> Ophelia
>>>
>>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Constructing a ZSM SAD in R

2009-01-11 Thread Peter Solymos
Dear Roy,
I haven't done this, but I would start with the function zsm in the
package untb by Robin Hankin. See also the function etienne.
Yours,
Péter

Péter Sólymos, PhD
Department of Mathematical and Statistical Sciences
University of Alberta
Edmonton, Alberta, T6G 2G1
Canada


On Sun, Jan 11, 2009 at 12:33 PM, Roy Martin  wrote:
> Hi all,
>
> I was hoping someone on the board would have experience constructing the
> zero-sum multinomial (ZSM) species abundance distribution (SAD) in R.
> Unfortunately, my expertise in probability theory and R programming appear
> limited, which has left me largely confused about how to construct the
> expected SAD dataset in R using the maximum likelihood estimates of 'theta'
> and 'm' that I have obtained for my dataset using the methods and PARI/GP
> code provided in Etienne (2005).  If anyone has any experience with this in
> R, I would be extremely grateful for any suggestions.
>
> Sincerely,
>
> Roy Martin
> PhD Candidate
> Wildlife and Fisheries Resources
> West Virginia University
> Morgantown, WV 26506
> rmart...@mix.wvu.edu
> roywo...@gmail.com
>
> Etienne, R.S. 2005. A new sampling formula for neutral biodiversity. Ecology
> Letters 8: 253-260.
>
>[[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Multivariate regression tree (mvpart)

2009-02-20 Thread Peter Solymos
Dear Manuel and Wilfried,
the ctree function in the party package for recursive part(y)itioning
can handle multivariate response. There is also a vignette:
http://cran.r-project.org/web/packages/party/vignettes/party.pdf
Best,
Péter

Péter Sólymos, PhD
Postdoctoral Fellow
Department of Mathematical and Statistical Sciences
University of Alberta
Edmonton, Alberta, T6G 2G1
Canada
email <- paste("solymos", "ualberta.ca", sep = "@")



On Thu, Feb 19, 2009 at 11:29 PM, Wilfried Thuiller
 wrote:
> Dear Manuel,
> This is the deviance of the node.
>
> You can have an access to it by:
>
> object$frame
>
> As far as I know, there is no other multivariate regression tree package in
> R.
>
> All the best
> Wilfried
>
>
>
>
> Manuel Spínola a écrit :
>>
>> Dear list members,
>>
>> I am working with the mvpart package.
>> The plot for a multivariate tree give 2 numbers associated to each
>> terminal node, one is the number of sampling units. Does anybody know what
>> the other number mean?
>> Also, is there any other package that handle multivariate regression
>> trees?
>> Thank you very much in advance.
>> Best,
>>
>> Manuel
>>
>
> --
> --
> Dr. Wilfried Thuiller
> Laboratoire d'Ecologie Alpine, UMR-CNRS 5553
> Université J. Fourier
> BP 53, 38041 Grenoble Cedex 9, France
> Tel: +33 (0)4 76 63 54 53
> Fax: +33 (0)4 76 51 42 79
>
> Email: wilfried.thuil...@ujf-grenoble.fr
> Home page: http://www.will.chez-alice.fr
> Website: http://www-leca.ujf-grenoble.fr/equipes/tde.htm
>
> FP6 European MACIS project: http://www.macis-project.net
> FP6 European EcoChange project: http://www.ecochange-project.eu
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Possion model for paired data

2009-03-06 Thread Peter Solymos
Hi Manuel,

I would suggest to use the signed difference. This will be Skellam
distribution with expected value mu1-mu2 (means of the two Poisson
distr) and variance mu1+mu2. The skellam and vglm functions in the
VGAM package can be used for a likelihood ratio test for equal means
(see example(skellam)). Please let me know if this works, because I
haven't tried yet.

Péter

Péter

Péter Sólymos, PhD
Postdoctoral Fellow
Department of Mathematical and Statistical Sciences
University of Alberta
Edmonton, Alberta, T6G 2G1
Canada
email <- paste("solymos", "ualberta.ca", sep = "@")



On Fri, Mar 6, 2009 at 7:00 AM, Manuel Spínola  wrote:
> Dear list memebers,
>
> Is there a way to do a paired t test but using a poisson model (for counts).
> Thank you very much in advance.
> Best,
>
> Manuel Spínola
>
>        [[alternative HTML version deleted]]
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] mixed model for repeat obs

2009-03-24 Thread Peter Solymos
Hi Kate,

You can use time series analysis (ar, arima functions at first)
instead, because YEAR and WEEK clearly has structure (i.e.
observations are conditional on previous observations with some lag).
To control for SITE, you can use polynomials of the geographical
coordinates (or write a hierarchical space-time series model say in
WinBUGS). Anyway, you should decide what are your parameters of
interest, and what you consider as nuisance parameters, and formulate
the model accordingly.

Cheers,

Péter

Péter Sólymos, PhD
Postdoctoral Fellow
Department of Mathematical and Statistical Sciences
University of Alberta
Edmonton, Alberta, T6G 2G1
Canada
email <- paste("solymos", "ualberta.ca", sep = "@")



On Tue, Mar 24, 2009 at 9:35 AM, CL Pressland
 wrote:
> I have a data set that is unbalanced and consists of:
>
> 67 SITEs measured over several YEARs every WEEK for butterflies (LEPS per
> m). I'm interested in the MANagement code assigned to each site, but I have
> also data on TEMPerature, average SUN and WIND. My guess is that a linear
> mixed model would be most appropriate and have constructed this code:
>
> model1<-lme(LEPS~MAN,random=~YEAR/WEEK|SITE)
>
> The output gives me:
> 
> Linear mixed-effects model fit by REML
> Data: NULL
>       AIC       BIC   logLik
>  -37631.24 -37566.48 18824.62
>
> Random effects:
> Formula: ~YEAR/WEEK| SITE
> Structure: General positive-definite, Log-Cholesky parametrization
>                 StdDev       Corr
> (Intercept)     5.875102e-03 (Intr) YEAR
> YEAR            1.392439e-06 -0.164
> YEAR:WEEK               5.068196e-07  0.531  0.301
> Residual        3.532589e-02
>
> Fixed effects: LEPS ~ MAN
>                 Value   Std.Error   DF  t-value p-value
> (Intercept) 0.009866718 0.001428957 9793 6.904841    0.00
> MAN             0.28304 0.001127429   65 0.025105    0.98
> Correlation:
>       (Intr)
> MAN -0.685
>
> Standardized Within-Group Residuals:
>       Min          Q1         Med          Q3         Max
> -2.70566579 -0.40089121 -0.18073723  0.05900735 19.16411466
>
> Number of Observations: 9860
> Number of Groups: 67
> 
>
> This output confuses me greatly! I figure that this clearly means that
> management has no effect on butterflies but how can I figure out what effect
> SITE, YEAR and WEEK have on the data? Would I have to also include them in
> the fixed effects side of the formula (I'm unsure if this is allowed)? Also,
> how could I include my weather variables? Would they just be placed on the
> fixed effect side of the formula? If they are correlated (I expect the
> weather variables are) would I have to place them in an interaction rather
> than separately?
>
> Any help that would be given would be gratefully received!
>
> Kate
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] testing for distribution

2009-05-13 Thread Peter Solymos
Dear Jacob,

Erika was right, you just have to perform a goodness of fit test. Bit
it is easier
to inspect your residual deviance.
It follows a Chi-sqared distribution, where the expected value should
be close to
the degrees of freedom if the fit is good. To get a P value for an
object of class
"negbin" (inheriting from glm and lm), use (note, H0: the fit is good):

library(MASS)
mod <- glm.nb(...your model...)
1-pchisq(mod$deviance, mod$df.residual)

If you are using other functions (i.e. in package pscl), the structure
of the returned object might change,
in this case simply type the numbers instead.

Cheers,

Péter

Péter Sólymos, PhD
Postdoctoral Fellow
Department of Mathematical and Statistical Sciences
University of Alberta
Edmonton, Alberta, T6G 2G1
Canada
email <- paste("solymos", "ualberta.ca", sep = "@")



On Wed, May 13, 2009 at 12:17 PM, Erika Mudrak  wrote:
> Jacob-  You can use a Chi-squared goodness of fit - chisq.test() for discrete 
> distributions like the negative binomial and a Kolmogorov-Smirnoff test- 
> ks.test() for continuous distributions.      They will both produce a p-value 
> which tests the null hypothesis that your data come from the given 
> distribution with stated parameters.    Use the parameter estimates from your 
> fitdistr() results. So if p>0.05 (or 0.1 or whatever), your data come from 
> that distribution.
>
> For Discrete distributions, try something like:
> fit=fitdistr(.)
> chisq.test(x=ActualData, y=rnbinom(n=length(ActualData), k=fit.k, mu=fit.mu))
> #I think this is right, I haven't actually tried it...
> # This is akin to quantitatively comparing your histograms...
>
>
> For continous distributions (such as beta), the code would be this:
> fit=fitdistr(...)
> ks.test(ActualData, "pbeta", shape1=fit$estimate[1],shape2=fit$estimate[2])
> # I've done this successfully
>
> You can use AIC to test if another distribution fits your data better than 
> negative binomial does.  I think it's possible for your data to "pass" the 
> Chi-Squared/Kolmogorov-Smirnoff test for two different distributions, but it 
> will fit one better than another.
>
> Erika Mudrak
>
>
> ---
> Erika Mudrak
> Graduate Student
> Department of Botany
> University of Wisconsin-Madison
> 430 Lincoln Dr
> Madison WI, 53706
> 608-265-2191
> mud...@wisc.edu
>
> - Original Message -
> From: "Capelle, Jacob" 
> Date: Tuesday, May 12, 2009 11:00 am
> Subject: [R-sig-eco]  testing for distribution
> To: r-sig-ecology@r-project.org
>
>
>> Dear all,
>>
>> I have a kind of a theoretical question from which I hope it might
>> interest you and hopefully can help me a bit.
>>
>> In order to obtain ecological (surrvey) data, I try to make a
>> prediction about the accuracy of a sampling tool to estimate mussel
>> density. For this reason I took a lot of samples at a certain fixed
>> location and counted the amount of mussels in each sample. Because
>> mussels are aggregated on the sediment, I had a lot of zero values. To
>> estimate the sample size I used a binomial distribution and obtained
>> the k value and the mu from the fitdistr(x,"negative binomial") (MASS).
>>
>> The question I have is: how can I test if this distribution accurately
>> described my (zero inflated count) data?
>>
>> I am a bit familiar with the AIC but since I only have counts on one
>> variable I cannot perform a GLS.
>> Creating a vector with rnbinom() using the k and mu from the
>> fitdistr() I plotted a histogram and compared it with my data, this
>> showed that is was roughly comparable, but I want to quantify this.
>>
>> I have a biological background not a statistical one, so I realize I
>> can ask silly questions.
>> But I hope someone can give me some hints.
>>
>> Kind regards,
>>
>> Jacob Capelle
>>
>> PhD student
>> Wageningen Imares
>> The Netherlands
>> jacob.cape...@wur.nl <
>>
>> ___
>> R-sig-ecology mailing list
>> R-sig-ecology@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] problem using reshape package

2009-07-09 Thread Peter Solymos
Hi,

something like this might help:

## your toy data set
d <- data.frame(SITE=rep(c("I","II"), each=3),
SPECIES=rep(LETTERS[1:3], 2),
Replicate1=c(1,2,1,4,1,6),
Replicate2=c(3,5,0,2,0,3),
Replicate3=c(0,1,2,0,0,3))

## required for functions inflate, stcs and mefa
library(mefa)

## this repeats SITE and SPECIES tags,
## and puts counts and replicate in a data frame
x <- data.frame(inflate(d[,1:2], rep(3, 6)),
Counts=array(t(d[,3:5])),
Replicate=rep(1:3, 6))

## something you wanted, except for NA's with xtabs (stats)
y1 <- xtabs(Counts ~ interaction(x$SITE, x$Replicate) + SPECIES, x)
## replicates cross tabulated separately
y2 <- xtabs(Counts ~ SITE + SPECIES + Replicate, x)
## same with mefa (will give you warnings
## due to some 'empty sample' misspecifications)
m <- mefa(stcs(x))
m$segm

Yours,

Peter

Peter Solymos, PhD
Postdoctoral Fellow
Department of Mathematical and Statistical Sciences
University of Alberta
Edmonton, Alberta, T6G 2G1
Canada
email <- paste("solymos", "ualberta.ca", sep = "@")



On Thu, Jul 9, 2009 at 5:37 AM, Maria Dulce
Subida wrote:
> Hello everyone!
>
> I'm having a problem in casting a data frame with the reshape package. I
> have an original data set of species abundances in replicate samples at
> certain sites, with the following form:
>
>
>
> SITE   SPECIES   Replicate1   Replicate2   Replicate3
>
>   I               A              1                   3                 0
>
>   I               B              2                   5                1
>
>   I               C              1                   0                 2
>
>   II             A              4                   2                0
>
>   II             C              1                   0                0
>
>   II             D              6                   3                3
>
>
>
> Please notice that site II does not have species B and has a new species D,
> the remaining two are shared with site I.
>
>
>
> I need to get these data in the form of a matrix like:
>
>
>
> SITE.REPLICATE     A          B         C         D
>
>            I.1                   1          2          1          NA
>
>            I.2                   3          5          0          NA
>
>            I.3                   0          1          2          NA
>
>            II.1                  4          NA      1          6
>
>            II.2                  2          NA      0          3
>
>            II.3                  0          NA      0          3
>
>
>
> Using the above "toy data" in R, everything works fine using melt and recast
> as follows (lets call test to may initial matrix):
>
>> testm <- melt (test, id.var=c("SITE","SPECIES"))
>
>> testc <- cast(testm, ...~SPECIES)
>
>> testc
>
>   SITE   variable    A  B C  D
>
> 1    I    Replicate1  1  2 1 NA
>
> 2    I    Replicate2  3  5 0 NA
>
> 3    I    Replicate3  0  1 2 NA
>
> 4   II    Replicate1  4 NA 1  6
>
> 5   II    Replicate2  2 NA 0  3
>
> 6   II    Replicate3  0 NA 0  3
>
>
>
>  However, when I use the same code in my real data set which is considerably
> larger (73 sites and 7 replicates for site, resulting in a molten matrix of
> 14469 x 4), the cast function does some kind of aggregation (in fact it
> advices of an aggregation using the default fun.aggregate) that I was not
> able to understand. I also tried to split my original data frame in order to
> get molten matrices smaller than 5500x4, but I got the same problem.
>
> Could anyone help me with this?
>
>
>
>  (I use R 2.8.1 for Windows)
>
>
>
> Thank you very very much in advance!
>
>
>
>
>
> Cheers,
>
>
>
> Dulce
>
>
>
>
>
>
>
>
>        [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>
___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] How to convert matrix to paired list?

2009-07-23 Thread Peter Solymos
Dear Jin-Long,

You can try this:

x <- cbind(rnorm(10), rnorm(10), rnorm(10), rnorm(10))
y <- cor(x)
y
library(mefa)
z <- as.data.frame(as.dist(y))
z

Yours,

Peter

Peter Solymos, PhD
Postdoctoral Fellow
Department of Mathematical and Statistical Sciences
University of Alberta
Edmonton, Alberta, T6G 2G1
Canada
email <- paste("solymos", "ualberta.ca", sep = "@")



On Thu, Jul 23, 2009 at 4:49 PM, Zhang Jinlong wrote:
> Dear lists
>  Suppose I have an correlation matrix of four places named "A","B","C"and"D", 
> as below:
>    A     B      C    D
> A   0    0.5   0.2   0.8
> B  0.5   0     0.6   0.4
> C  0.2   0.6   0      0.4
> D  0.8   0.4   0.4    0
> Is there a function that allows me to generate the paired list like this?
> A B 0.5
> A C 0.2
> A D 0.8
> B C 0.6
> B D 0.4
> C D 0.4
> Could you help me if you know a method. Any suggestions will be appreciated.
> Thanks
> Jinlong
>
>
> Jin-Long ZHANG
> Ph.D. Candidate
> Institute of Botany
> Chinese Academy of Sciences
> Beijing 100093
> E-mail: zhan...@ibcas.ac.cn
>
>
>        [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] How to convert matrix to paired list?

2009-07-24 Thread Peter Solymos
Hi,

The 'as.data.frame.dist' function requires the mefa package,
that's why I wrote the line 'library(mefa)'. This returns the
lower triangle only, but it does not return the row/col names.

The melt method, however returns the full matrix, not only
the lower triangle, see below.

Best,

Peter

--

x <- cbind(rnorm(10), rnorm(10), rnorm(10), rnorm(10))
colnames(x) <- LETTERS[1:4]
y <- cor(x)

> x <- cbind(rnorm(10), rnorm(10), rnorm(10), rnorm(10))
> colnames(x) <- LETTERS[1:4]
> y <- cor(x)
> y
A   B  C  D
A  1. -0.01444285  0.3969146  0.1356350
B -0.01444285  1. -0.6151891  0.6649487
C  0.39691455 -0.61518910  1.000 -0.3847042
D  0.13563501  0.66494872 -0.3847042  1.000
> library(mefa)
> as.data.frame(as.dist(y))
  row coldist
1   2   1 -0.01444285
2   3   1  0.39691455
3   4   1  0.13563501
4   3   2 -0.61518910
5   4   2  0.66494872
6   4   3 -0.38470419
> library(reshape)
> melt(y)
   X1 X2   value
1   A  A  1.
2   B  A -0.01444285
3   C  A  0.39691455
4   D  A  0.13563501
5   A  B -0.01444285
6   B  B  1.
7   C  B -0.61518910
8   D  B  0.66494872
9   A  C  0.39691455
10  B  C -0.61518910
11  C  C  1.
12  D  C -0.38470419
13  A  D  0.13563501
14  B  D  0.66494872
15  C  D -0.38470419
16  D  D  1.

On Thu, Jul 23, 2009 at 4:42 PM, Zhang Jinlong wrote:
> How to convert matrix to paired list?
> Dear lists
>  Suppose I have an correlation matrix of four places named "A","B","C"and"D", 
> as below:
>    A     B      C    D
> A   0    0.5   0.2   0.8
> B  0.5   0     0.6   0.4
> C  0.2   0.6   0     0.4
> D  0.8   0.4   0.4   0
> Is there a function that allows me to generate the paired list like this?
> A B 0.5
> A C 0.2
> A D 0.8
> B C 0.6
> B D 0.4
> C D 0.4
> Could you help me if you have an method. Any suggestions will be appreciated.
> Thanks
>
> Jinlong
>
>
> Jin-Long ZHANG
> Ph.D. Candidate
> Institute of Botany
> Chinese Academy of Sciences
> Beijing 100093
> E-mail: zhan...@ibcas.ac.cn
>
>
>        [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] How to convert matrix to paired list?

2009-07-24 Thread Peter Solymos
No worries!

I just added the option 'dim.names = TRUE' by which you
can get the dimnames back, but you still need the
matrix > dist > data.frame coercion chain.

as.data.frame.dist <-
function (x, row.names = NULL, optional = FALSE, dim.names = FALSE, ...)
{
if (!missing(optional))
.NotYetUsed("optional", error = FALSE)
id <- as.matrix(x)
id[lower.tri(id)] <- 1
id[upper.tri(id)] <- 0
diag(id) <- 0
rm <- row(id)
cm <- col(id)
rm <- array(rm)[array(id) == 1]
cm <- array(cm)[array(id) == 1]
out <- data.frame(row=rm, col=cm, dist=dist2vec(x))
if (!is.null(row.names))
rownames(out) <- row.names
if (dim.names) {
out$row <- as.factor(out$row)
out$col <- as.factor(out$col)
levels(out$row) <- rownames(id)
levels(out$col) <- colnames(id)
}
out
}

dist2vec
function (x)
{
attributes(x) <- NULL
x
}

Cheers,

Peter



2009/7/24 Zhang Jinlong :
> Dear Peter,
>    Sorry for bothering you so much.
>    You are right,the function melt() in reshape package returns to full
> pairs of data. The method you write give exactly what I needed.There's
> another function, which was written by my friend Guofang, could solve the
> trouble(see below).
>    Thank you very much indeed.
>    Best Regards
>
>   Yours Jin
>
>
>
> read.tri.matrix=function(d)
> {
> if(is.null(colnames(d)))colnames(d)=paste('V',1:dim(d)[2],sep="")
> if(!is.matrix(d)) as.matrix(d)->d
> m=lower.tri(d) # 提取矩阵下三角
> colnames(m)=row.names(d)
> rownames(m)=row.names(d)
> as.data.frame(m)->m
> stack(m)->
> 
> names()[2]="contrast1"
> rep(colnames(d),dim(d)[1])->$contrast2
> d=as.data.frame(d)
> stack(d)$values->$value
> [$values,-1]->result
> row.names(result)=NULL
> result
> }
> # by Guofang LIU from IBCAS
>
>
>
>
>
>
>
>
>
>> Hi,
>
>>
>
>> The 'as.data.frame.dist' function requires the mefa package,
>
>> that's why I wrote the line 'library(mefa)'. This returns the
>
>> lower triangle only, but it does not return the row/col names.
>
>>
>
>> The melt method, however returns the full matrix, not only
>
>> the lower triangle, see below.
>
>>
>
>> Best,
>
>>
>
>> Peter
>
>>
>
>> --
>
>>
>
>> x <- cbind(rnorm(10), rnorm(10), rnorm(10), rnorm(10))
>
>> colnames(x) <- LETTERS[1:4]
>
>> y <- cor(x)
>
>>
>
>> > x <- cbind(rnorm(10), rnorm(10), rnorm(10), rnorm(10))
>
>> > colnames(x) <- LETTERS[1:4]
>
>> > y <- cor(x)
>
>> > y
>
>> A   B  C  D
>
>> A  1. -0.01444285  0.3969146  0.1356350
>
>> B -0.01444285  1. -0.6151891  0.6649487
>
>> C  0.39691455 -0.61518910  1.000 -0.3847042
>
>> D  0.13563501  0.66494872 -0.3847042  1.000
>
>> > library(mefa)
>
>> > as.data.frame(as.dist(y))
>
>>   row coldist
>
>> 1   2   1 -0.01444285
>
>> 2   3   1  0.39691455
>
>> 3   4   1  0.13563501
>
>> 4   3   2 -0.61518910
>
>> 5   4   2  0.66494872
>
>> 6   4   3 -0.38470419
>
>> > library(reshape)
>
>> > melt(y)
>
>>X1 X2   value
>
>> 1   A  A  1.
>
>> 2   B  A -0.01444285
>
>> 3   C  A  0.39691455
>
>> 4   D  A  0.13563501
>
>> 5   A  B -0.01444285
>
>> 6   B  B  1.
>
>> 7   C  B -0.61518910
>
>> 8   D  B  0.66494872
>
>> 9   A  C  0.39691455
>
>> 10  B  C -0.61518910
>
>> 11  C  C  1.
>
>> 12  D  C -0.38470419
>
>> 13  A  D  0.13563501
>
>> 14  B  D  0.66494872
>
>> 15  C  D -0.38470419
>
>> 16  D  D  1.
>
>>
>
>> On Thu, Jul 23, 2009 at 4:42 PM, Zhang Jinlong wrote:
>
>> > How to convert matrix to paired list?
>
>> > Dear lists
>
>> > 燬uppose I have an correlation matrix of four places named "A","B","C"and"D", as below:
>
>> > ?燗 ??B ??燙 ?燚
>
>> > A ?0 ??.5 ?0.2 ?0.8
>
>> > B ?.5 ?0 ??0.6 ?0.4
>
>> > C ?.2 ?0.6 ?0 ??0.4
>
>> > D ?.8 ?0.4 ?0.4 ?0
>
>> > Is there a function that allows me to generate the paired list like this?
>
>> > A B 0.5
>
>> > A C 0.2
>
>> > A D 0.8
>
>> > B C 0.6
>
>> > B D 0.4
>
>> > C D 0.4
>
>> > Could you help me if you have an method. Any suggestions will be appreciated.
>
>> > Thanks
>
>> >
>
>> > Jinlong
>
>> >
>
>> >
>
>> > Jin-Long ZHANG
>
>> > Ph.D. Candidate
>
>> > Institute of Botany
>
>> > Chinese Academy of Sciences
>
>> > Beijing 100093
>
>> > E-mail: zhan...@ibcas.ac.cn
>
>> >
>
>> >
>
>> > ???燵[alternative HTML version deleted]]
>
>> >
>
>> > ___
>
>> > R-sig-ecology mailing list
>
>> > r-sig-ecol...@r-project.org
>
>> > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>> >
>
>> >
>
>>
>
> Jin-Long ZHANG
> Ph.D. Candidate
> Institute of Botany
> Chinese Academy of Sciences
> Beijing 100093
> E-mail: zhan...@ibcas.ac.cn
>
>
>
>
___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Help with downloading package

2009-08-19 Thread Peter Solymos
Dear Leigh,

You have 2 options:

1. build the MAC OS X package for yourself and install it, in this way
you will be able to use the help files as usual,
2. unpack the .tar.gz and source all files in the /R directory (on how
to do it at once see Example in help(source)).

Cheers,

Peter


On Wed, Aug 19, 2009 at 3:58 PM, Leigh Fall wrote:
> Hey all,
>
> I have recently starting using R, so I'm still on the steep-learning curve
> of the program.  I have been unable to install Diveristy.R from Dr.
> Pelissier's website (http://pelissier.free.fr/Diversity.html).  I have
> downloaded the libraries "ade4" and "spdep" (along with all the other
> packages needed to run spdep).  I'm running R version 2.9.1 on a MAC OS X
> version 10.5.8.  When I click on "Package source (all platforms):
> diversity_1.5-9.tar.gz"
> from the website, a folder labeled "diversity" is downloaded onto my
> desktop.  One question I have is where the package/folder/library should be
> downloaded.   My other question is how to install it.  I've tried to Package
> Installer-Local Binary Package and Local Source Package at the user level
> without success.  I also just downloaded the tar.gz file directly to my
> desktop and then use the Package Installer, but again without success.  I
> tried the install.packages command, but I don't know if I'm using it
> correctly.  The solution is probably simple, but with my limited experience
> with writing code and programming, I'm not sure what I'm doing wrong.
>
> Any help will be appreciated!!  Many thanks in advance!!!
>
> Cheers,
> Leigh
>
>
> --
> 
> Leigh M. Fall
> Ph.D. Candidate
> Dept. of Geology and Geophysics
> Texas A&M University
> 3115 TAMU
> College Station, TX  77843
> Phone: (979) 845-3071
> E-mail: leigh.f...@gmail.com
> http://geoweb.tamu.edu/profile/LFall
> 
>
>        [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] glm for ratio [0,1] data

2009-08-31 Thread Peter Solymos
Hi Bálint,

Here are my two cents.

By using LM with transformed data (which transformation can also be
logit, loglog, cloglog, probit) you loose the Binomial error
structure, because you won't follow the trial/success experiment
scheme. But percent cover is not that kind of [0,1] data where this
sampling is assumed, I think that's why you have asked :)

If your data is an estimate of a hidden response, than there must be
ways to account for this, but I can only recall an example where e.g.
Y is Poisson, but you observe it as ordinal (0, few, many). So you can
establish cutoff values to get ordinal response from you percent
cover, and use a hierarchical model in BUGS/JAGS (see WinBUGS manual
for an example).

Cheers,

Peter


On Mon, Aug 31, 2009 at 6:24 AM, Bálint Czúcz wrote:
> Dear List,
>
> does anyone know a good way to perform GLM on ratio data (i.e. data
> between 0 and 1)? Binomial GLM is quite straightforward to use if you
> have integer numbers for successes/failures. But how to proceed if you
> only have the ratio? This can occur in a multitude of ways, e.g the
> response variable is the estimated cover of a species, percentage of
> canopy lost, etc.
>
> One solution I know about is to try to transform such responses to
> normal with the arcsine-squarroot transformation, and use lm on the
> transformed response -- e.g. Crawley (2007, The R Book, p. 570.)
> explicitely suggests this strategy.
>
> But I would still be interested if there is a glm approach that could
> be used with the untransformed data. After hours spent with searching
> for literature on such a glm, I couldn't find any. Do you know of
> some?
>
> I would also be interested what happens if I just proceed with a
> binomial glm with the response being between [0,1] and weights left to
> 1. I know glm() will throw a warning -- but it also produces an
> output. Can this output contain some valid, interpretable results, or
> is it completely bullshit because of the violation of the assumptions?
>
> Thank you!
> Bálint
>
>
> --
> Bálint Czúcz
> Institute of Ecology and Botany of the Hungarian Academy of Sciences
> H-2163 Vácrátót, Alkotmány u. 2-4. HUNGARY
> Tel: +36 28 360122/137  +36 70 7034692
> magyar nyelvű blog: http://atermeszettorvenye.blogspot.com/
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] adonis model specification

2009-09-29 Thread Peter Solymos
Christine,

There is no summary method for adonis. After calling the function,
simply use print:
x <- adonis(...)
x

And you are right, you can supply raw data and use the method argument
in adonis to define dissimilarity index (which is "bray" by default).

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
email <- paste("solymos", "ualberta.ca", sep = "@")



On Tue, Sep 29, 2009 at 3:47 AM, Christine Griffiths
 wrote:
> Dear R helpers,
>
> I have collected species composition data (counts of 24 species)from 16
> plots. I am interested in seeing how different the plots are from each
> other and how this changes over time.
>
> My design is as follows: I repeatedly sampled the plots each month for 11
> months. The plots are grouped into three treatments (1-3), which were
> replicated 6 times (blocks 1-6). I have unbalanced data in that I lost 2
> replicates from 2 of the treatments, so I have only 16 plots in total.
>
>> From what I have read, I want to use the adonis function in the vegan
>
> package. Having had a look at the set up of the dune data example, I have
> set up my data in the same format. Below is an idea of my data.
> I have 16 plots * 11 months = 176 sites. Since in 3 of these there were no
> species, I only have 173 sites.
>
> dataset.plot.env[1:5,]
>  ID Area Treatment Block BlockID Month
> 1  1    1         1     1       1     1
> 2  2    1         1     2       2     1
> 3  3    1         1     3       3     1
> 4  4    2         1     4       4     1
> 5  5    2         1     5       5     1
>
> block<-as.factor(Block)
> month<-as.factor(Month)
> area<-as.factor(Area)
> treatment<-as.factor(Treatment)
>
> ### data is counts. x? are species.
> dataset.plot.count[1:5,1:15]
>  X5 X6 X7 X8 X9 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20
> 1 11  2  1  0 58  10  28 137   0   0  87  18  68  10   0
> 2  6 10  2  0 36   1  26 192  65   0   6  35  94  14   0
> 3  2  0  0  0 27  41  33 394  16   0   3  14   4  16   0
> 4 22 54  7  0 26  42  10 275 137   4  22  10  30   1   0
> 5  1  5 10  0  0  19  45 599  60   0   6   3   0   5   0
>
> library(vegan)
> data1<-vegdist(dataset.plot.count,method="morisita")
>
> ## since I am using count data I chose morisita index.
> ## I read that first I needed to convert my data into a dissimilarity
> ## matrix. Or is this not necessary if I stipulate method in the adonis
> ## function?
>
> m1<-adonis(data1~treatment,data=dataset.plot.env,permutation=200,method="mo
> risita")
> summary(m1)
>
> If I do either of the above or the following, I get this result.
> m1<-adonis(dataset.plot.count~treatment,data=dataset.plot.env,permutation=2
> 00,method="morisita")
> summary(m1)
>
>            Length Class      Mode
> aov.tab        6    data.frame list
> call           4    -none-     call
> coefficients   0    -none-     NULL
> coef.sites   519    -none-     numeric
> f.perms      200    -none-     numeric
> model.matrix 519    -none-     numeric
> terms          3    terms      call
>>
>
> I am unsure exactly of what I am doing wrong or what I need to specify. I
> would be grateful for any help and guidance.
>
> I have read that I can use strata to specify a time series. If so would I
> code it like this:
> m1<-adonis(dataset.plot.count~treatment,data=dataset.plot.env,strata=datase
> t.plot.env[,"Month"],permutation=200,method="morisita")
> ## This doesn't work either though.
>
> I had originally planned on doing MRPP but having read that it is less
> robust I opted for adonis. With MRPP, I was going to test the difference in
> species composition between treatments at each month as I think I am unable
> to test this for the whole study. Using adonis, do I need to do a number of
> tests or can I simply run the following?
>
> m1<-adonis(dataset.plot.count~treatment*month*block,data=dataset.plot.env,s
> trata=dataset.plot.env[,"Month"],permutation=200,method="morisita")
>
> I have had a look at Oksanen 2009, Multivariate Analysis of Ecological
> Communities in R: vegan tutorial. I have not been able to find much more
> information on adonis. Could anyone direct me to more literature please.
>
> Any help would be much appreciated.
>
> many thanks
> Christine
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] adonis model specification

2009-09-29 Thread Peter Solymos
Christine,

There is no summary method for adonis. After calling the function,
simply use print:
x <- adonis(...)
x

And you are right, you can supply raw data and use the method argument
in adonis to define dissimilarity index (which is "bray" by default).

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
email <- paste("solymos", "ualberta.ca", sep = "@")



On Tue, Sep 29, 2009 at 3:47 AM, Christine Griffiths
 wrote:
> Dear R helpers,
>
> I have collected species composition data (counts of 24 species)from 16
> plots. I am interested in seeing how different the plots are from each
> other and how this changes over time.
>
> My design is as follows: I repeatedly sampled the plots each month for 11
> months. The plots are grouped into three treatments (1-3), which were
> replicated 6 times (blocks 1-6). I have unbalanced data in that I lost 2
> replicates from 2 of the treatments, so I have only 16 plots in total.
>
>> From what I have read, I want to use the adonis function in the vegan
>
> package. Having had a look at the set up of the dune data example, I have
> set up my data in the same format. Below is an idea of my data.
> I have 16 plots * 11 months = 176 sites. Since in 3 of these there were no
> species, I only have 173 sites.
>
> dataset.plot.env[1:5,]
>  ID Area Treatment Block BlockID Month
> 1  1    1         1     1       1     1
> 2  2    1         1     2       2     1
> 3  3    1         1     3       3     1
> 4  4    2         1     4       4     1
> 5  5    2         1     5       5     1
>
> block<-as.factor(Block)
> month<-as.factor(Month)
> area<-as.factor(Area)
> treatment<-as.factor(Treatment)
>
> ### data is counts. x? are species.
> dataset.plot.count[1:5,1:15]
>  X5 X6 X7 X8 X9 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20
> 1 11  2  1  0 58  10  28 137   0   0  87  18  68  10   0
> 2  6 10  2  0 36   1  26 192  65   0   6  35  94  14   0
> 3  2  0  0  0 27  41  33 394  16   0   3  14   4  16   0
> 4 22 54  7  0 26  42  10 275 137   4  22  10  30   1   0
> 5  1  5 10  0  0  19  45 599  60   0   6   3   0   5   0
>
> library(vegan)
> data1<-vegdist(dataset.plot.count,method="morisita")
>
> ## since I am using count data I chose morisita index.
> ## I read that first I needed to convert my data into a dissimilarity
> ## matrix. Or is this not necessary if I stipulate method in the adonis
> ## function?
>
> m1<-adonis(data1~treatment,data=dataset.plot.env,permutation=200,method="mo
> risita")
> summary(m1)
>
> If I do either of the above or the following, I get this result.
> m1<-adonis(dataset.plot.count~treatment,data=dataset.plot.env,permutation=2
> 00,method="morisita")
> summary(m1)
>
>            Length Class      Mode
> aov.tab        6    data.frame list
> call           4    -none-     call
> coefficients   0    -none-     NULL
> coef.sites   519    -none-     numeric
> f.perms      200    -none-     numeric
> model.matrix 519    -none-     numeric
> terms          3    terms      call
>>
>
> I am unsure exactly of what I am doing wrong or what I need to specify. I
> would be grateful for any help and guidance.
>
> I have read that I can use strata to specify a time series. If so would I
> code it like this:
> m1<-adonis(dataset.plot.count~treatment,data=dataset.plot.env,strata=datase
> t.plot.env[,"Month"],permutation=200,method="morisita")
> ## This doesn't work either though.
>
> I had originally planned on doing MRPP but having read that it is less
> robust I opted for adonis. With MRPP, I was going to test the difference in
> species composition between treatments at each month as I think I am unable
> to test this for the whole study. Using adonis, do I need to do a number of
> tests or can I simply run the following?
>
> m1<-adonis(dataset.plot.count~treatment*month*block,data=dataset.plot.env,s
> trata=dataset.plot.env[,"Month"],permutation=200,method="morisita")
>
> I have had a look at Oksanen 2009, Multivariate Analysis of Ecological
> Communities in R: vegan tutorial. I have not been able to find much more
> information on adonis. Could anyone direct me to more literature please.
>
> Any help would be much appreciated.
>
> many thanks
> Christine
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Mantel test with skew-symmetric matrices?

2009-10-01 Thread Peter Solymos
Dear Steve,

If the direction is important, you can use that information as a
separate matrix with signs to scale up its effect. Because distance
can't be negative, you might end up with numbers hard to interpret.

Yours,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
email <- paste("solymos", "ualberta.ca", sep = "@")



On Thu, Oct 1, 2009 at 11:46 AM, Jari Oksanen  wrote:
>
>
>
> On 1/10/09 20:36 PM, "Sarah Goslee"  wrote:
>
>> I can only speak for the mantel() within ecodist, but I can tell you that it
>> will not take full matrices - the upper triangle will be dropped. You could
>> roll your own very easily, but it would be exceedingly slow, eg:
>>
>> mat1 <- # some square skew-symmetric matrix
>> mat2 <- # some other square skew-symmetric matrix
>>
>> mantelr <- cor(as.vector(mat1), as.vector(mat2))
>> nperm <- 100 # bigger for real problem, of course
>> permresults <- numeric(nperm)
>> permresults[1] <- mantelr
>>
>> for(thisperm in 2:nperm) {
>>    randsample <- sample(1:nrow(mat1))
>>    permresults[thisperm] <- cor(as.vector(mat1[randsample,
>> randsample]), as.vector(mat2))
>> }
>> and then use permresults to estimate your test statistic of interest.
>>
> Sarah & Steve,
>
> This was the design I had on my mind. However, I was not sure how
> skew-symmetry actually was defined, and therefore I didn't know if free
> permutation of rows and columns (even when done correctly like above) will
> retain the skew-symmetry. The free permutation would be OK for non-symmetric
> matrices, but what about skew-symmetric? (Little thinking and pen & paper
> probably would give a quick answer, but I won't do that for a while).
>
> Cheers, Jari
>
>> I haven't thought at all about any statistical issues raised by use of full
>> non-symmetric matrices - you're on your own there. It's certainly
>> *possible*, and
>> I don't see any immediate reason why it would be wrong, but haven't
>> pondered the issue.
>>
>> I see Jari replied as well while I was writing - as for vegan, the
>> ecodist function would
>> need to be heavily modified to do this. If I'm persuaded that there's
>> enough interest,
>> I could add it to the list.
>>
>> Sarah
>>
>> On Thu, Oct 1, 2009 at 1:20 PM, Steve Arnott  wrote:
>>> Hello All,
>>
>>> 1) Is it wrong to use skew-symmetric matrices - i.e. should I just forget
>>> about the skew data and use absolute values to make all my matrices
>>> symmetric? The original Mantel paper (1967, Cancer Research, 2: 209-220) 
>>> does
>>> talk about skew-symmetric matrices, but the published applications I've come
>>> across only seem to use symmetric matrices.
>>>
>>>  2) If it is ok to use skew-symmetric matrices, do the mantel() and
>>> mantel.partial() functions in 'vegan' (or related functions in other
>>> packages, such as 'ecodist') handle them correctly? I've found that it is
>>> possible to process skew data and generate results using these functions, 
>>> but
>>> I'm uncertain from the documentation whether the results are meaningful 
>>> (i.e.
>>> is the coding designed to handle such cases appropriately?)
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] R-sig-ecology Digest, Vol 19, Issue 2

2009-10-02 Thread Peter Solymos
Dear All,

I admit that overdispersion can be a problem. But you can't compare
Poisson with quasi-Poisson based on logLik, because the likelihood is
not defined for quasi* models. The quasi-likelihood can be maximized
to get the dispersion parameter, but coefficients are the same, only
SE's and p-values are corrected:

## some random data
y<-rpois(100, 3)
x<-rnorm(100)
## GLMs
m1 <- glm(y~x,family=poisson)
m2 <- glm(y~x,family=quasipoisson)
## coefficients are equal
all.equal(coef(m1), coef(m2))
## SE's are not
rbind(pois=coef(summary(m1))[,2], qpois=coef(summary(m2))[,2])
## p-values are not
rbind(pois=coef(summary(m1))[,4], qpois=coef(summary(m2))[,4])
## logLik for Poisson: OK
logLik(m1)
## logLik for Poisson: NA
logLik(m2)

The pscl package provides negative binomial models with zero inflation
too (see Achim Zeileis, Christian Kleiber, Simon Jackman:
Regression Models for Count Data in R, JSS, http://www.jstatsoft.org/v27/i08).

If you have fancier (say GLMM) models, you can make likelihood ratio
test, but that might be quite advanced to do so (see José Miguel
Ponciano, Mark L. Taper, Brian Dennis, Subhash R. Lele (2009)
Hierarchical models in ecology: confidence intervals, hypothesis
testing, and model selection using data cloning. Ecology: Vol. 90, No.
2, pp. 356-362.).

Yours,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
email <- paste("solymos", "ualberta.ca", sep = "@")



On Fri, Oct 2, 2009 at 9:53 AM, Nicholas Lewin-Koh  wrote:
>  No No No No No!
> The log likelihood of the Poisson and the Gaussian are not comparable.
> One is a discrete distribution and the other continuous, you can get
> into all
> sorts of trouble there and not just pathological cases. They are on
> totally different scales.
>
> You need to make a decision if you want to model the MEAN species
> richness as
> continuous, and not worry about answers like 3.1 species. You are
> modeling the mean.
> Or go with a discrete distribution like Poisson or quasi-Poisson, you
> can test
> for overdispersion within a discrete family of distributions.  As
> someone
> mentioned before if your counts are away from zero, the Poisson is very
> symmetric,
> and goes asymptotically to a normal. But for practical purposes your
> results
> should be similar. For small samples, ie with categorical predictors and
> few
> counts per cell, it can make a difference.
>
> So, if you want to do model selection, you have to first choose
> discrete or continuous, then within that set compare log likelihoods.
> (you are on firmer ground if the models are somehow nested).
>
> Nicholas
>> Message: 10
>> Date: Fri, 02 Oct 2009 08:29:10 +0200
>> From: Carsten Dormann 
>> Subject: Re: [R-sig-eco] Negative binomial
>> To: "Canning-Clode, Joao" 
>> Cc: "r-sig-ecology@r-project.org" 
>> Message-ID: <4ac59db6.9030...@ufz.de>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> Dear Joao,
>>
>> I propose you do the following (and wait for the outcry-responses to
>> this email to see if it is a reasonable proposal):
>>
>> Fit your model with different types of distributions and compare their
>> logLik-values:
>> logLik(glm(y ~ x1+x2+x3+I(x1^2) + x1:x3, family=gaussian))
>> logLik(glm(y ~ x1+x2+x3+I(x1^2) + x1:x3, family=poisson))
>> logLik(glm(y ~ x1+x2+x3+I(x1^2) + x1:x3, family=quasipoisson))
>> logLik(glm.nb(y ~ x1+x2+x3+I(x1^2) + x1:x3)) # require(MASS)
>>
>> The model with the highest log-Likelihood is the distribution of choice
>> and you can defend it against reviewer.
>>
>> A few notes:
>> 1. You obviously cannot do this when one of the models uses transformed
>> responses (e.g. log(y)), because the LL will then be completely
>> different.
>> 2. When you use a more complex model (say a GLMM), you can approximate
>> the neg.bin through a two-step procedure: 1. fit a (wrongly structured)
>> glm.nb and extract the theta value from the summary of the model, say
>> theta=4.5 (that is the second parameter of the neg.bin distribution).
>> Then fit the GLMM again, giving as family the argument:
>> negative.binomial(theta=4.5) (again from package MASS). The same holds
>> for GAMs and other models requiring a specification of family.
>> 3. You may want to dig around for books recommending the above
>> procedure. I think I got this as advice from someone else, but haven't
>> bothered yet to look it up (obviously MASS would be a good starting
>> place, in their description of the neg.bin). I saw a paper that does
>> this (using the minimum AIC but otherwise this approach), but it is not
>> a statistical, but rather an ecological paper (although the analyst in
>> the author group is a biometrician whom I full trust): Weigelt, A.,
>> Schumacher, J., Walther, T. Bartelheimer, M., Steinlein, T., Beyschlag,
>> W. (2006) Identifying mechanisms of competition in multispecies
>> communities. Journal of 

Re: [R-sig-eco] using two distance metrices in formula

2009-10-13 Thread Peter Solymos
Dear All,

Perhaps, there is another way of approaching this problem: the
Monmonier's maximum-difference barriers algorithm.

Monmonier, M. (1973) Maximum-difference barriers: an alternative
numerical regionalization method. Geographic Analysis, 3, 245–261.
Manni, F., Guerard, E. and Heyer, E. (2004) Geographic patterns of
(genetic, morphologic, linguistic) variation: how barriers can be
detected by "Monmonier's algorithm". Human Biology, 76, 173–190

Nice documantation and a free software Barrier here:
http://www.mnhn.fr/mnhn/ecoanthropologie/software/barrier.html
The R implementation has been done by Thibaut Jombart in the adegenet
package (function monmonier), that is slightly different in handling
tessellation at the margin. Permutation test are discussed in the
Manni paper as I recall and implemented in Barrier. This is especially
useful for genetic distances (and NOT for biotic data, because of
profoundly different dispersal mechanisms involved in shaping genetic
vs. community structures).

Yours,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635
email <- paste("solymos", "ualberta.ca", sep = "@")



On Tue, Oct 13, 2009 at 10:07 AM, Jari Oksanen  wrote:
>
>
>
> On 13/10/09 18:44 PM, "Jens Oldeland"  wrote:
>
>> Hi again,
>>
>> our distance matrices are 1) genetic distance (Jaccard) and 2)
>> 3D-Euclidean Distance and the question we want to solve is if there is
>> an effect called "Isolation by Distance" (IBD) in our data (genetic and
>> "real"-distances of snails on the island of crete) or not. There was a
>> debate on the topic if the mantel test or the partial mantel test (isn´t
>> this similar to MRM?) in several papers mainly in evolution-journals:
>>
>> Raufaste, N. and F. Rousset. 2001. Are partial Mantel tests adequate?
>> Evolution 55:1703­1705
>> Castellano, S. and E. Balletto. 2002. Is the partial Mantel test
>> inadequate? Evolution 56:1871­1873.
>>
>>
>> Geffen, E., Anderson, M.J., & Wayne, R.K. (2004). Climate and habitat
>> barriers to dispersal in the highly mobile grey wolf. Molecular Ecology,
>> 13, 2481-2490
>> explain it nicely on page p.2483 (LHS)
>>
>> "The problem arises due to the lack of independence of individual
>> distances in a distance matrix. Although a simple Mantel test overcomes
>> this issue by the
>> use of permutations, a permutational approach does not necessarily solve
>> problems introduced by several uncontrolled nuisance parameters in the
>> case of more than one
>> regressor (i.e. partial tests). Thus, we do not use a Mantel approach
>> here, but rather use the distance-based multivariate approach of McArdle
>> & Anderson (2001). The important point is that, for dbRDA, the
>> individual distances are not treated as a single univariate response
>> variable, as in the Mantel test, but rather the individual sites are the
>> units of observation for analysis, about which we have calculated
>> distances using an entire set of genetic variables. The distance matrix
>> is therefore treated as information regarding multivariate
>> response.Taking this multivariate approach avoids the problems
>> associated with the partial Mantel test."
>
> Jens,
>
> There has been a very similar discussion in the Ecology recently between my
> good friends, Hanna Tuomisto & co vs. Pierre Legendre & co. However, the
> point here and above exactly was that you cannot use dissimilarities on the
> RHS (lack of independence), but you must use rectangular data in dbRDA. If
> you use distances on the RHS you won't have dbRDA but you get Mantel family
> methods (like MRM in ecodist). The problem, of course, is how to map
> distances onto Euclidean space (= rectangular data) *and* still study the
> effects of the distances instead of the effects of *location*. I don't know
> any really good solution here, but all proposed solutions have their
> problems. Pierre Legendre, Daniel Borcard and Hanna Tuomisto have all tried
> to convince me of their point of view, and while all their conflicting
> arguments make sense, they are not yet an optimal solution.
>
> Cheers, Jari Oksanen
>>
>> so we thought it would be a good idea not to use mantel and friends
>> since the problem of IBD seems to need a different approach here.
>>
>> best,
>> Jens
>>
>>
>>
>>
>>
>> Sarah Goslee schrieb:
>>> That doesn't make much sense to me. You'd need an entirely different method
>>> than capscale.
>>>
>>> Perhaps what you're looking for is more like multiple regression on distance
>>> matrices (implemented in MRM in ecodist)?
>>>
>>>      Lichstein, J. 2007. Multiple regression on distance matrices: A
>>>      multivariate spatial analysis tool. Plant Ecology 188: 117-131.
>>>
>>>      Legendre, P.; Lapointe, F. and Casgrain, P. 1994. Modeling brain
>>>      evolution from behavior: A permutational regression approach.
>>>      Evolution 48: 1

Re: [R-sig-eco] null models with continuous abundance data

2010-01-06 Thread Peter Solymos
Dear Etienne,

You can try the Chris Hennig's prablus package which have a parametric
bootstrap based null-model where clumpedness of occurrences or
abundances (this might allow continuous data, too) is estimated from
the site-species matrix and used in the null-model generation. But
here, the sum of the matrix will vary randomly.

But if you have environmental covariates, you might try something more
parametric. For example the simulate.rda or simulate.cca functions in
the vegan package, or fit multivariate LM for nested models (i.e.
intercept only, and with other covariates) and compare AIC's, or use
the simulate.lm to get random numbers based on the fitted model. This
way you can base you desired statistic on the simulated data sets, and
you know explicitly what is the model (plus it is good for continuous
data that you have). By using the null-model approach, you implicitly
have a model by defining constraints for the permutations, and
p-values are probabilities of the data given the constraints (null
hypothesis), and not probability of the null hypothesis given the data
(what people usually really want).

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635



On Wed, Jan 6, 2010 at 2:18 AM, Carsten Dormann  wrote:
> Hi Etienne,
>
> the double constraint is observed by two functions:
>
> swap.web in package bipartite
>
> and
>
> commsimulator in vegan (at least in the r-forge version)
>
> Both build on the r2dtable approach, i.e. you have, as you propose, to turn
> the low values into higher-value integers.
>
> The algorithm is described in the help to swap.web.
>
>
> HTH,
>
> Carsten
>
>
>
> On 06/01/2010 08:55, Etienne Laliberté wrote:
>>
>> Hi,
>>
>> Let's say I have measured plant biomass for a total of 5 species from 3
>> sites (i.e. plots), such that I end with the following data matrix
>>
>> mat<- matrix(c(0.35, 0.12, 0.61, 0, 0, 0.28, 0, 0.42, 0.31, 0.19, 0.82,
>> 0, 0, 0, 0.25), 3, 5, byrow = T)
>>
>> dimnames(mat)<- list(c("site1", "site2", "site3"), c("sp1", "sp2",
>> "sp3", "sp4", "sp5"))
>>
>> Data is therefore continuous. I want to generate n random community
>> matrices which both respect the following constraints:
>>
>> 1) row and column totals are kept constant, such that "productivity" of
>> each site is maintained, and that rare species at a "regional" level
>> stay rare (and vice-versa).
>>
>> 2) number of species in each plot is kept constant, i.e. each row
>> maintains the same number of zeros, though these zeros should not stay
>> fixed.
>>
>> To deal with continuous data, my initial idea was to transform the
>> continuous data in mat to integer data by
>>
>> mat2<- floor(mat * 100 / min(mat[mat>  0]) )
>>
>> where multiplying by 100 is only used to reduce the effect of rounding
>> to nearest integer (a bit arbitrary). In a way, shuffling mat could now
>> be seen as re-allocating "units of biomass" randomly to plots. However,
>> doing so results in a matrix with large number of "individuals" to
>> reshuffle, which can slow things down quite a bit. But this is only part
>> of the problem.
>>
>> My main problem has been to find an algorithm that can actually respect
>> constraints 1 and 2. Despite trying various R functions (r2dtable,
>> permatfull, etc), I have not yet been able to find one that can do
>> this.
>>
>> I've had some kind help from Peter Solymos who suggested that I try the
>> aylmer package, and it's *almost* what I need, but the problem is that
>> their algorithm does not allow for the zeros to move within the matrix;
>> they stay fixed. I want the number of zeros to stay constant within each
>> row, but I want them to move freely betweem columns.
>>
>> Any help would be very much appreciated.
>>
>> Thanks
>>
>>
>
> --
> Dr. Carsten F. Dormann
> Department of Computational Landscape Ecology
> Helmholtz Centre for Environmental Research-UFZ
> Permoserstr. 15
> 04318 Leipzig
> Germany
>
> Tel: ++49(0)341 2351946
> Fax: ++49(0)341 2351939
> Email: carsten.dorm...@ufz.de
> internet: http://www.ufz.de/index.php?de=4205
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] null models with continuous abundance data

2010-01-07 Thread Peter Solymos
simple
>> > algorithm, as well as some code to
>> run a test below. I'd be very interested to
>> > compare it to Vazquez's
>> algorithm!
>>
>> Thanks again
>>
>> Etienne
>>
>> ###
>>
>> nullabun <-
>> > function(x){
>>    require(vegan)
>>    nsites <- nrow(x)
>>    nsp <- ncol(x)
>>
>> > rowsum <- rowSums(x)
>>    colsum <- colSums(x)
>>    nullbin <- commsimulator(x,
>> > method = "quasiswap")
>>    rownull <- rowSums(nullbin)
>>    colnull <-
>> > colSums(nullbin)
>>    rowsum <- rowsum - rownull
>>    colsum <- colsum - colnull
>>
>> > ind <- rep(1:nsp, colsum)
>>    ress <- rep(1:nsites, rowsum)
>>    total <-
>> > sum(rowsum)
>>    selinds <- sample(1:total)
>>       for (j in 1:total){
>>
>> > indsel <- ind[selinds[j]]
>>          sitepot <- which(nullbin[, indsel] > 0)
>>
>> > if (length(ress[ress %in% sitepot]) > 0 ){
>>             sitesel <-
>> > sample(ress[ress %in% sitepot], 1)
>>             ress <- ress[-which(ress ==
>> > sitesel)[1] ]
>>             nullbin[sitesel, indsel] <- nullbin[sitesel, indsel]
>> > + 1
>>            }
>>       }
>> return(nullbin)
>> }
>>
>>
>> ### now let's test the
>> > function
>>
>> # create a dummy abundance matrix
>> m <-
>> > matrix(c(1,3,2,0,3,1,0,2,1,0,2,1,0,0,1,2,0,3,0,0,0,1,4,3), 4,
>> > 6,
>> byrow=TRUE)
>>
>> # generate a null matrix
>> m.null <- nullabun(m)
>>
>> # compare
>> > total number of individuals
>> sum(m) #30
>> sum(m.null) #29: one individual got
>> > "stuck" with no ressources...
>>
>> # to generate more than one null matrix, say
>> > 999
>> nulls <- replicate(n = 999, nullabun(m), simplify = F)
>>
>> # how many unique
>> > null matrices?
>> length(unique(nulls) ) # I found 983 out of 999
>>
>> # how many
>> > individuals in m?
>> sum(m) # there are 30
>>
>> # how bad is the problem of
>> > individuals getting stuck with no ressources
>> in null matrices?
>> sums <-
>> > as.numeric(lapply(nulls, sum, simplify = T))
>> hist(sums) # the vast majority
>> > had 29 or 30 individuals
>>
>>
>> Le jeudi 07 janvier 2010 à 10:34 +0100, Carsten
>> > Dormann a écrit :
>> > Hi Etienne,
>> >
>> > I'm afraid that swap.web cannot easily
>> > accommodate this constraint.
>> > Diego Vazquez has used an alternative approach
>> > for this problem, but I
>> > haven't seen code for it (it's briefly described in
>> > his Oikos 2005
>> > paper). While swap.web starts with a "too-full" matrix and
>> > then
>> > downsamples, Diego starts by allocating bottom-up. There it should be
>> >
>> > relatively easy to also constrain the row-constancy of connectance.
>> >
>> > I
>> > can't promise, but I will hopefully have this algorithm by the end
>> > of next
>> > week (I'm teaching this stuff next week and this is one of the
>> > assignments).
>> > If so, I'll get it over to you.
>> >
>> > The problem that already arises with
>> > swap.web for very sparse matrices
>> > is that there are very few unique
>> > combinations left after all these
>> > constraints. This will be even worse for
>> > your approach, and plant
>> > community data are usually VERY sparse. Once you
>> > have produced the
>> > null models, you should assure yourself that there are
>> > hundreds
>> > (better thousands) of possible randomised matrices. Adding just
>> > one
>> > more constraint to your null model (that of column AND row constancy
>> >
>> > of 0s) will uniquely define the matrix! Referees may not pick it up,
>> > but it
>> > may give you trivial results.
>> >
>> > Best wishes,
>> >
>> > Carsten
>> >
>> >
>> > Vázquez,
>> > D. P., Melián, C. J., Williams, N. M., Blüthgen, N., Krasnov,
>> > B. R., Poulin,
>> > R., et al. (2007). Species abundance and asymmetric
>> > interaction strength in
>> > ecological networks. Oikos, 116, 1120-1127.
>> >
>> > On 06/01/2010 23:57, Etienn

Re: [R-sig-eco] null models with continuous abundance data

2010-01-07 Thread Peter Solymos
l if you want
> to claim significant results based on quantitative null models. Would I be a
> referee or an editor for this kind of ms, I would be very skeptical ask for
> the proof of the validity of the null model.
>
> Cheers, Jari Oksanen
>
>> I don't see this as being a
>> real
> problem with the large matrices I'll deal with though, but
>> definitely
> something to be aware of.
>
> If you're curious, here's the simple
>> algorithm, as well as some code to
> run a test below. I'd be very interested to
>> compare it to Vazquez's
> algorithm!
>
> Thanks again
>
> Etienne
>
> ###
>
> nullabun <-
>> function(x){
>   require(vegan)
>   nsites <- nrow(x)
>   nsp <- ncol(x)
>
>> rowsum <- rowSums(x)
>   colsum <- colSums(x)
>   nullbin <- commsimulator(x,
>> method = "quasiswap")
>   rownull <- rowSums(nullbin)
>   colnull <-
>> colSums(nullbin)
>   rowsum <- rowsum - rownull
>   colsum <- colsum - colnull
>
>> ind <- rep(1:nsp, colsum)
>   ress <- rep(1:nsites, rowsum)
>   total <-
>> sum(rowsum)
>   selinds <- sample(1:total)
>      for (j in 1:total){
>
>> indsel <- ind[selinds[j]]
>         sitepot <- which(nullbin[, indsel] > 0)
>
>> if (length(ress[ress %in% sitepot]) > 0 ){
>            sitesel <-
>> sample(ress[ress %in% sitepot], 1)
>            ress <- ress[-which(ress ==
>> sitesel)[1] ]
>            nullbin[sitesel, indsel] <- nullbin[sitesel, indsel]
>> + 1
>           }
>      }
> return(nullbin)
> }
>
>
> ### now let's test the
>> function
>
> # create a dummy abundance matrix
> m <-
>> matrix(c(1,3,2,0,3,1,0,2,1,0,2,1,0,0,1,2,0,3,0,0,0,1,4,3), 4,
>> 6,
> byrow=TRUE)
>
> # generate a null matrix
> m.null <- nullabun(m)
>
> # compare
>> total number of individuals
> sum(m) #30
> sum(m.null) #29: one individual got
>> "stuck" with no ressources...
>
> # to generate more than one null matrix, say
>> 999
> nulls <- replicate(n = 999, nullabun(m), simplify = F)
>
> # how many unique
>> null matrices?
> length(unique(nulls) ) # I found 983 out of 999
>
> # how many
>> individuals in m?
> sum(m) # there are 30
>
> # how bad is the problem of
>> individuals getting stuck with no ressources
> in null matrices?
> sums <-
>> as.numeric(lapply(nulls, sum, simplify = T))
> hist(sums) # the vast majority
>> had 29 or 30 individuals
>
>
> Le jeudi 07 janvier 2010 à 10:34 +0100, Carsten
>> Dormann a écrit :
>> Hi Etienne,
>>
>> I'm afraid that swap.web cannot easily
>> accommodate this constraint.
>> Diego Vazquez has used an alternative approach
>> for this problem, but I
>> haven't seen code for it (it's briefly described in
>> his Oikos 2005
>> paper). While swap.web starts with a "too-full" matrix and
>> then
>> downsamples, Diego starts by allocating bottom-up. There it should be
>>
>> relatively easy to also constrain the row-constancy of connectance.
>>
>> I
>> can't promise, but I will hopefully have this algorithm by the end
>> of next
>> week (I'm teaching this stuff next week and this is one of the
>> assignments).
>> If so, I'll get it over to you.
>>
>> The problem that already arises with
>> swap.web for very sparse matrices
>> is that there are very few unique
>> combinations left after all these
>> constraints. This will be even worse for
>> your approach, and plant
>> community data are usually VERY sparse. Once you
>> have produced the
>> null models, you should assure yourself that there are
>> hundreds
>> (better thousands) of possible randomised matrices. Adding just
>> one
>> more constraint to your null model (that of column AND row constancy
>>
>> of 0s) will uniquely define the matrix! Referees may not pick it up,
>> but it
>> may give you trivial results.
>>
>> Best wishes,
>>
>> Carsten
>>
>>
>> Vázquez,
>> D. P., Melián, C. J., Williams, N. M., Blüthgen, N., Krasnov,
>> B. R., Poulin,
>> R., et al. (2007). Species abundance and asymmetric
>> interaction strength in
>> ecological networks. Oikos, 116, 1120-1127.
>>
>> On 06/01/2010 23:57, Etienne
>> Laliberté wrote:
>> > Many thanks Carsten and Peter for your suggestions.
>> >
>>
>> > commsimulator indeed respects the two contraints I'

Re: [R-sig-eco] null models with continuous abundance data

2010-01-10 Thread Peter Solymos
sel, indsel]<- nullbin[sitesel, indsel]
>> >>
>> >>> + 1
>> >>>
>> >>             }
>> >>        }
>> >> return(nullbin)
>> >> }
>> >>
>> >>
>> >> ### now let's test the
>> >>
>> >>> function
>> >>>
>> >> # create a dummy abundance matrix
>> >> m<-
>> >>
>> >>> matrix(c(1,3,2,0,3,1,0,2,1,0,2,1,0,0,1,2,0,3,0,0,0,1,4,3), 4,
>> >>> 6,
>> >>>
>> >> byrow=TRUE)
>> >>
>> >> # generate a null matrix
>> >> m.null<- nullabun(m)
>> >>
>> >> # compare
>> >>
>> >>> total number of individuals
>> >>>
>> >> sum(m) #30
>> >> sum(m.null) #29: one individual got
>> >>
>> >>> "stuck" with no ressources...
>> >>>
>> >> # to generate more than one null matrix, say
>> >>
>> >>> 999
>> >>>
>> >> nulls<- replicate(n = 999, nullabun(m), simplify = F)
>> >>
>> >> # how many unique
>> >>
>> >>> null matrices?
>> >>>
>> >> length(unique(nulls) ) # I found 983 out of 999
>> >>
>> >> # how many
>> >>
>> >>> individuals in m?
>> >>>
>> >> sum(m) # there are 30
>> >>
>> >> # how bad is the problem of
>> >>
>> >>> individuals getting stuck with no ressources
>> >>>
>> >> in null matrices?
>> >> sums<-
>> >>
>> >>> as.numeric(lapply(nulls, sum, simplify = T))
>> >>>
>> >> hist(sums) # the vast majority
>> >>
>> >>> had 29 or 30 individuals
>> >>>
>> >>
>> >> Le jeudi 07 janvier 2010 à 10:34 +0100, Carsten
>> >>
>> >>> Dormann a écrit :
>> >>> Hi Etienne,
>> >>>
>> >>> I'm afraid that swap.web cannot easily
>> >>> accommodate this constraint.
>> >>> Diego Vazquez has used an alternative approach
>> >>> for this problem, but I
>> >>> haven't seen code for it (it's briefly described in
>> >>> his Oikos 2005
>> >>> paper). While swap.web starts with a "too-full" matrix and
>> >>> then
>> >>> downsamples, Diego starts by allocating bottom-up. There it should be
>> >>>
>> >>> relatively easy to also constrain the row-constancy of connectance.
>> >>>
>> >>> I
>> >>> can't promise, but I will hopefully have this algorithm by the end
>> >>> of next
>> >>> week (I'm teaching this stuff next week and this is one of the
>> >>> assignments).
>> >>> If so, I'll get it over to you.
>> >>>
>> >>> The problem that already arises with
>> >>> swap.web for very sparse matrices
>> >>> is that there are very few unique
>> >>> combinations left after all these
>> >>> constraints. This will be even worse for
>> >>> your approach, and plant
>> >>> community data are usually VERY sparse. Once you
>> >>> have produced the
>> >>> null models, you should assure yourself that there are
>> >>> hundreds
>> >>> (better thousands) of possible randomised matrices. Adding just
>> >>> one
>> >>> more constraint to your null model (that of column AND row constancy
>> >>>
>> >>> of 0s) will uniquely define the matrix! Referees may not pick it up,
>> >>> but it
>> >>> may give you trivial results.
>> >>>
>> >>> Best wishes,
>> >>>
>> >>> Carsten
>> >>>
>> >>>
>> >>> Vázquez,
>> >>> D. P., Melián, C. J., Williams, N. M., Blüthgen, N., Krasnov,
>> >>> B. R., Poulin,
>> >>> R., et al. (2007). Species abundance and asymmetric
>> >>> interaction strength in
>> >>> ecological networks. Oikos, 116, 1120-1127.
>> >>>
>> >>> On 06/01/2010 23:57, Etienne
>> >>> Laliberté wrote:
>> >>>
>> >>>> Many thanks Carsten and Peter 

Re: [R-sig-eco] multiple regression

2010-02-06 Thread Peter Solymos
Nathan,

Species richness is categorical, so if your richness values are
usually low (say < 20), you should consider the use of Poisson GLM, or
log-transform your response (and log is the canonical link function
for Poisson GLM). This usually improves the model fit. And this might
apply to abundance as well.

If you use lm(), you can inspect the residual variance of the models
after excluding one of the covariates. The increase in residual
variance compared to the full model will tell which variance component
is higher (explains more of your data). Or you may as well inspect the
anova() table of the fitted model (both for lm or glm).

Best,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635



On Sat, Feb 6, 2010 at 9:17 AM, Nathan Lemoine  wrote:
> Hi everyone,
>
> I'm trying to fit a multiple regression model and have run into some
> questions regarding the appropriate procedure to use. I am trying to compare
> fish assemblages (species richness, total abundance, etc.) to metrics of
> habitat quality. I swam transects are recorded all fish observed, then I
> measured the structural complexity and live coral cover over each transect.
> I am interested in weighting which of these two metrics has the largest
> influence on structuring fish assemblages.
>
> My strategy was to use a multiple linear regression. Since the data were in
> two different measurement units, I scaled the variables to a mean of 0 and
> std. dev. of 1. This should allow me to compare the sizes of the beta
> coefficients to determine the relative (but not absolute) importance of each
> habitat variable on the fish assemblage, correct?
>
> My model was lm(Species Richness~Complexity+Coral Cover). I had run a full
> model and found no evidence of interactions, so I ran it without the
> interaction present.
>
> It turns out coral cover was not significant in any regression. I have been
> told that the test I used was incorrect and that the appropriate procedure
> is a stepwise regression, which would, undoubtedly, provide me with
> Complexity as a significant variable and remove Coral Cover. This seems to
> me to be the exact same interpretation as the above model. So, since I'm
> very new to all of this, I am wondering how to tell whether one model is
> 'incorrect' or 'inappropriate' given that they yield almost identical
> results? What are the advantages of a stepwise regression over a standard
> multiple regression like I have run?
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] multiple regression

2010-02-06 Thread Peter Solymos
I meant "Species richness is discrete", not categorical.
Peter

On Sat, Feb 6, 2010 at 12:52 PM, Peter Solymos  wrote:
> Nathan,
>
> Species richness is categorical, so if your richness values are
> usually low (say < 20), you should consider the use of Poisson GLM, or
> log-transform your response (and log is the canonical link function
> for Poisson GLM). This usually improves the model fit. And this might
> apply to abundance as well.
>
> If you use lm(), you can inspect the residual variance of the models
> after excluding one of the covariates. The increase in residual
> variance compared to the full model will tell which variance component
> is higher (explains more of your data). Or you may as well inspect the
> anova() table of the fitted model (both for lm or glm).
>
> Best,
>
> Peter
>
> Péter Sólymos
> Alberta Biodiversity Monitoring Institute
> Department of Biological Sciences
> CW 405, Biological Sciences Bldg
> University of Alberta
> Edmonton, Alberta, T6G 2E9, Canada
> Phone: 780.492.8534
> Fax: 780.492.7635
>
>
>
> On Sat, Feb 6, 2010 at 9:17 AM, Nathan Lemoine  
> wrote:
>> Hi everyone,
>>
>> I'm trying to fit a multiple regression model and have run into some
>> questions regarding the appropriate procedure to use. I am trying to compare
>> fish assemblages (species richness, total abundance, etc.) to metrics of
>> habitat quality. I swam transects are recorded all fish observed, then I
>> measured the structural complexity and live coral cover over each transect.
>> I am interested in weighting which of these two metrics has the largest
>> influence on structuring fish assemblages.
>>
>> My strategy was to use a multiple linear regression. Since the data were in
>> two different measurement units, I scaled the variables to a mean of 0 and
>> std. dev. of 1. This should allow me to compare the sizes of the beta
>> coefficients to determine the relative (but not absolute) importance of each
>> habitat variable on the fish assemblage, correct?
>>
>> My model was lm(Species Richness~Complexity+Coral Cover). I had run a full
>> model and found no evidence of interactions, so I ran it without the
>> interaction present.
>>
>> It turns out coral cover was not significant in any regression. I have been
>> told that the test I used was incorrect and that the appropriate procedure
>> is a stepwise regression, which would, undoubtedly, provide me with
>> Complexity as a significant variable and remove Coral Cover. This seems to
>> me to be the exact same interpretation as the above model. So, since I'm
>> very new to all of this, I am wondering how to tell whether one model is
>> 'incorrect' or 'inappropriate' given that they yield almost identical
>> results? What are the advantages of a stepwise regression over a standard
>> multiple regression like I have run?
>>
>> ___
>> R-sig-ecology mailing list
>> R-sig-ecology@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>
>>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] multiple regression

2010-02-08 Thread Peter Solymos
Dear List,

Thierry's suggestion, to use Binomial(p, N) for modelling species
richness, assumes that the probability of finding a new species (p)
depends e.g. on covaiates (logit(p)=X%*%beta), while different species
share the same probability to be encountered (N independent? trials --
as Alain noted). Because ecological communities rarely have uniform
species-abundance distribution, and species specific probabilities
will probably differ among sites due to different responses to
environmental factors, the Binomial approximation has limited
applicability. And this can be true even for the Poisson. So it turns
out that modeling marginal statistics (total abundance/richness) of
the community matrix requires modeling the communities first...

By the way, Nathan wrote me off list, that he used log transformed
richness, which is the traditional species-area way of handling
richness. He was more interested in variance components, but this
diverged conversation also brought up some interesting views.

Cheers,

Peter



On Mon, Feb 8, 2010 at 6:12 AM, ONKELINX, Thierry
 wrote:
> Dear Gavin,
>
> For many taxonomical groups to total number of species is rather low. 
> Ecologist can either use a fixed total number of species or use some expert 
> knowledge to get the total number of species, taking only large scale effect 
> into account. E.g. freshwater fish not entering seawater, plant species not 
> occuring above a given altitude, ...
>
> The number of absent species is then the total number minus the number of 
> present species.
>
> As long as the number of present species is much smaller than the total 
> number of species, a Poisson distribution seems a reasonable simplification. 
> But what if you are studying a small taxonomical group? Let's assume a group 
> with 10 species and frequently 8 or 9 of them are present. Can assume that 
> the species richness follows a Poisson distribution in that case?
>
> Best regards,
>
> Thierry
>
>
> 
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek
> team Biometrie & Kwaliteitszorg
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
>
> Research Institute for Nature and Forest
> team Biometrics & Quality Assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
>
> tel. + 32 54/436 185
> thierry.onkel...@inbo.be
> www.inbo.be
>
> To call in the statistician after the experiment is done may be no more than 
> asking him to perform a post-mortem examination: he may be able to say what 
> the experiment died of.
> ~ Sir Ronald Aylmer Fisher
>
> The plural of anecdote is not data.
> ~ Roger Brinner
>
> The combination of some data and an aching desire for an answer does not 
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
>
> -Oorspronkelijk bericht-
> Van: Gavin Simpson [mailto:gavin.simp...@ucl.ac.uk]
> Verzonden: maandag 8 februari 2010 11:14
> Aan: ONKELINX, Thierry
> CC: Peter Solymos; Nathan Lemoine
> Onderwerp: Re: [R-sig-eco] multiple regression
>
> On Mon, 2010-02-08 at 11:02 +0100, ONKELINX, Thierry wrote:
>> Peter,
>>
>> I would think that the species richness is binomial distributed. Since
>> there is a maximum number of species that can be present. Therefore I
>> would model it like
>>
>> glm(cbind(number.present, number.absent) ~ covariates, family =
>> binomial)
>
> Hi Thierry,
>
> How would one estimate number.absent? To my mind, that sounds some what 
> Rumsfeldian...
>
> G
>
>>
>> HTH,
>>
>> Thierry
>>
>> --
>> --
>> ir. Thierry Onkelinx
>> Instituut voor natuur- en bosonderzoek team Biometrie & Kwaliteitszorg
>> Gaverstraat 4 9500 Geraardsbergen Belgium
>>
>> Research Institute for Nature and Forest team Biometrics & Quality
>> Assurance Gaverstraat 4 9500 Geraardsbergen Belgium
>>
>> tel. + 32 54/436 185
>> thierry.onkel...@inbo.be
>> www.inbo.be
>>
>> To call in the statistician after the experiment is done may be no more than 
>> asking him to perform a post-mortem examination: he may be able to say what 
>> the experiment died of.
>> ~ Sir Ronald Aylmer Fisher
>>
>> The plural of anecdote is not data.
>> ~ Roger Brinner
>>
>> The combination of some data and an aching desire for an answer does not 
>> ensure that a reasonable answer can be extracted from a given body of data.
>> ~ John Tukey
>>
>> -Oorspronkelijk bericht-
>> Van: r-sig-ecology-boun...@r-project.org
>&

Re: [R-sig-eco] ZINB or density data models with lots of zeros

2010-03-11 Thread Peter Solymos
Trevor,

You can use weights in the model to provide the surface area (or
sqrt(surface area) to enhance linearity) and leave the counts as they
are in the ZINB model. (In the zeroinfl function weights are used to
weight the log-likelihood and to scale the residuals.)

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635


On Thu, Mar 11, 2010 at 9:03 AM, tavery  wrote:
> I have completed zero-inflation negative binomial (ZINB) models on count
> data for the absolute counts of ectoparasites on fish where there are lots
> of zeros (everything worked well using Zuur et al. and a host of other
> sources). The fish are of different sizes with corresponding differences in
> surface areas of fins etc. and I would now like to compare density of
> parasites among each area. Densities were calculated by dividing the counts
> for each parasite by the surface area of the fin (etc.) and surface areas
> were different for each individual i.e. scaled for size of fish.
>
> The comparisons are then of non-integer values that do not play nice with
> Poisson or Negative Binomial models. However, the issue of having lots of
> zeros remains and will affect mean values if I were to use some sort of
> ANOVA based analysis.
>
> Does anyone have any suggestions on how to deal with the many zeros for the
> density data (assuming I was to use an ANOVA type analysis)? I have also
> thought to just include fish size as a covariate in the ZINB models, but a)
> have not seen an example of such, b) do not want to over complicate the
> analysis, and/or c) this will only scale the counts to overall fish size,
> not to fin etc. surface areas. Of course, fin surface areas probably scale
> linearly and, if so, body size might be an appropriate covariate to remove
> the effect from the ZINB comparisons of counts on fins i.e. essentially the
> same comparison (density = ZINB with fish size covariate). Does that make
> sense?
>
> This ZINB model works well (yes, the 'false' zeros are independent of
> factors hence the "| 1"). Where would I insert the fish size covariate?
> location_on_body = where parasites were found (7 body areas)
> location = site of fish capture (2 sites)
>
> zinb2<-zeroinfl(caligus_elongatus~location_on_body+location | 1,
> dist="negbin", link="logit", data=sturg)
>
> I would assume this:
>
> zinb2<-zeroinfl(caligus_elongatus~fish_size+location_on_body+location | 1,
> dist="negbin", link="logit", data=sturg)
>
> thanks in advance,
> trevor
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] ZINB or density data models with lots of zeros

2010-03-11 Thread Peter Solymos
Scott,

thanks for pointing out the Hall paper. BTW the Negative Binomial
model itself is a kind or random intercept model, where the random
effect is not Normal on the linear predictor scale, but Gamma on the
response scale. But you are right, it can be tricky to define more
complicated random effects in R for zero-inflated data, or at least I
am not aware of any instant solutions.

Trevor: you can try transforming the surface area for the weights, but
if you are interested in prediction, the best way could be to simply
take the covariate model, that is something like an abundance-area
relationship.

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635



On Thu, Mar 11, 2010 at 2:25 PM, Scott Foster  wrote:
> Hi,
>
> I, and many others, would not use the density as the outcome.  If you are
> using the log-link for the NB part of the model then it is appropriate to
> use log( size) as an offset.  This is because
>
> log( E( count / size)) = log( E( count) / size); as expectation is w.r.t.
> counts, not size
>                                = log( E( count)) - log( size)
>                                = linear predictor
>
> So
>
> log( E( count)) = linear predictor + log( size).
>
> This can be done in R by simply appending +offset( log( size)) in your model
> formula for the NB model.  This essentially adds a term to your linear
> predictor with a set coefficient of 1.  If you think that this may not be
> appropriate then you can add it as a covariate (as Peter just suggested).
>
> Just a cautionary note:  If you start to look at counts within particular
> areas of a fish (i.e. test if more parasites are on fins or on body or on
> ...) then you will start to require nested models, ala mixed effects.  After
> all, if a fish is infected then it is(?) more likely to have parasites in
> all areas (and vice versa for a fish that isn't infected).  I've not used
> random effects in a zero inflated model but I have seen a paper that
> includes random effects in zero-inflated models (Hall 2000).  I'm not aware
> of any software that will do this automatically for you.  Someone else might
> know...
>
> If you really *must* use the ratio as an outcome and do not use the offset
> approach then you are left with two options: delta approaches (often called
> hurdle models), or Tweedie type models.  I won't say any more now as I
> suspect that this is probably not exactly what you want to be doing (I do
> have particular opinions though).
>
> HTH,
>
> Scott
>
> Hall, D.B. Zero-Inflated Poisson and Binomial Regression with Random
> Effects: A Case Study.  Biometrics 56, 1030-1039.
>
>
> Peter Solymos wrote:
>>
>> Trevor,
>>
>> You can use weights in the model to provide the surface area (or
>> sqrt(surface area) to enhance linearity) and leave the counts as they
>> are in the ZINB model. (In the zeroinfl function weights are used to
>> weight the log-likelihood and to scale the residuals.)
>>
>> Cheers,
>>
>> Peter
>>
>> Péter Sólymos
>> Alberta Biodiversity Monitoring Institute
>> and Boreal Avian Modelling project
>> Department of Biological Sciences
>> CW 405, Biological Sciences Bldg
>> University of Alberta
>> Edmonton, Alberta, T6G 2E9, Canada
>> Phone: 780.492.8534
>> Fax: 780.492.7635
>>
>>
>> On Thu, Mar 11, 2010 at 9:03 AM, tavery  wrote:
>>
>>>
>>> I have completed zero-inflation negative binomial (ZINB) models on count
>>> data for the absolute counts of ectoparasites on fish where there are
>>> lots
>>> of zeros (everything worked well using Zuur et al. and a host of other
>>> sources). The fish are of different sizes with corresponding differences
>>> in
>>> surface areas of fins etc. and I would now like to compare density of
>>> parasites among each area. Densities were calculated by dividing the
>>> counts
>>> for each parasite by the surface area of the fin (etc.) and surface areas
>>> were different for each individual i.e. scaled for size of fish.
>>>
>>> The comparisons are then of non-integer values that do not play nice with
>>> Poisson or Negative Binomial models. However, the issue of having lots of
>>> zeros remains and will affect mean values if I were to use some sort of
>>> ANOVA based analysis.
>>>
>>> Does anyone have any suggestions on how to deal with the many zeros for
>>> the
>>> densit

Re: [R-sig-eco] Species CoOccurance

2010-03-12 Thread Peter Solymos
Lanna,

I don't know exactly what do you mean by co-occurrence data frame, but
if you'd like to get a species-by-species matrix, in which you count
the co-occurrences of the species (columns in the sites-by-species
community matrix) with each other, you can use the crossprod function
or the %*% operator for a presence/absence matrix:

m <- matrix(rbinom(15,1,0.6),5,3)
t(m) %*% m
crossprod(m, m)

HTH,

Peter


On Fri, Mar 12, 2010 at 7:22 AM, Lanna Jin  wrote:
> Hello All,
>
> I've been sifting through tons of R Documentation, trying to find a command
> that converts a community matrix dataset into a species pair co-occurance
> data frame, and have failed miserably to find one.  Does a function already
> exist in one of the community analysis packages?
>
> Thanks in advance for your response!
>
> --
> Lanna Jin
>
> lanna...@gmail.com
> 510-898-8525
>
>        [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Handling lots of zero + spatial autocorrelation

2010-03-25 Thread Peter Solymos
Claudia,

Here is a more specific paper with hierarchical Bayesian model:

Stephen L. Rathbun and Songlin Fei
A spatial zero-inflated poisson regression model for oak regeneration
Environmental and Ecological Statistics, 2006, 13: 409-426
http://www.springerlink.com/content/r327264t016x2873/

I am not sure if the zero probability part is fully identifiable, but
if you use a constant probability model for the zeros, that might work
out even better. (The paper comes with no R implementation.)

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635



2010/3/25 Etienne Laliberté :
> Hi Claudia,
>
> I think you'll find the following article most useful:
>
> Dormann, C. F., J. M. McPherson, M. B. Araujo, R. Bivand, J. Bolliger,
> G. Carl, R. G. Davies, A. Hirzel, W. Jetz, W. Daniel Kissling, I. Kuhn,
> R. Ohlemuller, P. R. Peres-Neto, B. Reineking, B. Schroder, F. M.
> Schurr, et R. Wilson. 2007. Methods to account for spatial
> autocorrelation in the analysis of species distributional data: a
> review. Ecography 30:609-628.
>
> Cheers
>
> Etienne
>
> Le jeudi 25 mars 2010 à 12:00 +0100, r-sig-ecology-requ...@r-project.org
> a écrit :
>> Send R-sig-ecology mailing list submissions to
>>       r-sig-ecology@r-project.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>>       https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>> or, via email, send a message with subject or body 'help' to
>>       r-sig-ecology-requ...@r-project.org
>>
>> You can reach the person managing the list at
>>       r-sig-ecology-ow...@r-project.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of R-sig-ecology digest..."
>>
>>
>> Today's Topics:
>>
>>    1. Handling lots of zero + spatial autocorrelation
>>       (Claudia liliana Ballesteros Mejia)
>>
>>
>> --
>>
>> Message: 1
>> Date: Wed, 24 Mar 2010 12:28:01 -0700 (PDT)
>> From: Claudia liliana Ballesteros Mejia 
>> To: R-list 
>> Subject: [R-sig-eco] Handling lots of zero + spatial autocorrelation
>> Message-ID: <344339.76193...@web112706.mail.gq1.yahoo.com>
>> Content-Type: text/plain
>>
>> Dear list,
>>
>> I'm trying to model the distribution of records that I have collected in my 
>> study area affected by parameters as population density and access. I have 
>> non normal distribution of the data,  lots of zeros and an important level 
>> of spatial autocorrelation. So, I wonder if there is a way in R to handle at 
>> the same time the 3 problems, because so far, I've tried the zero inflation 
>> poisson regression but it doesn't take into account the spatial 
>> autocorrelation, and gls, but it doesn't solve my problem of the zeros and 
>> nonnormal distribution.
>>
>> I would appreciate any hint,
>>
>> Liliana.
>>
>>
>>
>>
>>       [[alternative HTML version deleted]]
>>
>>
>>
>> --
>>
>> ___
>> R-sig-ecology mailing list
>> R-sig-ecology@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>
>>
>> End of R-sig-ecology Digest, Vol 24, Issue 12
>> *
>
>
> --
> Etienne Laliberté
> 
> School of Forestry
> University of Canterbury
> Private Bag 4800
> Christchurch 8140, New Zealand
> Phone: +64 3 366 7001 ext. 8365
> Fax: +64 3 364 2124
> www.elaliberte.info
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] vegdist producing empty result object

2010-04-07 Thread Peter Solymos
Dave,

The vegdist function of the vegan package produces an object of class
"dist" similarly to the dist function in stats. It can be converted
into a symmetric matrix by as.matrix(x). The help page of dist will
give you details about the structure of "dist" objects.

Cheers,

Peter



On Wed, Apr 7, 2010 at 12:02 PM, Chagaris, Dave  wrote:
> I have a dataframe of abundances with 385 rows (samples) and 82 columns 
> (species).  I am trying to compute the Bray-Curtis dissimilarity between 
> samples.  For some reason, the resulting distance object output by vegdist is 
> empty and when I dim the object the result is NULL.  I don't get any error 
> messages.  The same dataset works in matlab and  Primer.
>
> Does anybody know what might be causing this?  The output is below.  Thanks.
>
>>   dim(guts)
> [1] 385  82
>
>>   disBray <- vegdist(guts, method="bray")
>
>>   dim(disBray)
> NULL
>
>>   is.matrix(disBray)
> [1] FALSE
>
> Dave
>
>
> David Chagaris
> Associate Research Scientist
> Florida Fish and Wildlife Conservation Commission
> Florida Fish and Wildlife Research Institute
> 100 8th Ave SE
> St. Petersburg, FL  33701
> (727) 896-8626 ext. 4305
> (727) 893-1374 fax
>
>
>        [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Vegan fisher.alpha error

2010-06-01 Thread Peter Solymos
Kang Min,

The error comes from the function 'fisherfit' that uses 'nlm' to
minimize the negative log likelihood for the Fisher's log-series.
Numerical optimization does not tolerate missing values. This code
reproduces your error:

> library(vegan)
> data(BCI)
> BCI[1,1] <- NA
> fisher.alpha(BCI)
Error in nlm(Dev.logseries, n.r = n.r, p = p, N = N, hessian = TRUE, ...) :
  missing value in parameter

So try to exclude problematic rows:

> fisher.alpha(BCI[rowSums(is.na(BCI))==0,])

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635
email <- paste("solymos", "ualberta.ca", sep = "@")


On Tue, Jun 1, 2010 at 6:20 AM, Kang Min  wrote:
>
> Hi,
>
> I have an error with fisher.alpha from the vegan package.
>
>> fisher.alpha(data[[1]])
> Error in nlm(Dev.logseries, n.r = n.r, p = p, N = N, hessian =
> TRUE, ...) :
>  missing value in parameter
>
> I am trying to find fisher alpha for a list of 100 data frames, and I
> tried it on individual data frames in the list, which gave me the
> error above.
> I have every data frame in the same format as the example data BCI,
> species as column names and quadrats as row names.
>
> Can anyone explain what the error is about? Thanks a lot.
>
> Kang Min
> --
> View this message in context: 
> http://r-sig-ecology.471788.n2.nabble.com/Vegan-fisher-alpha-error-tp5125719p5125719.html
> Sent from the r-sig-ecology mailing list archive at Nabble.com.
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] repeated measures with multivariate data

2010-07-23 Thread Peter Solymos
Kay,

using strata (restricting permutations within time points, and not
within locations) in adonis makes some sense, given that permutation
tests assume independence. But that does not solve the problem of
dependence, but it is a good starter. If you have a before-after
control-treatment design, you might want to use the absolute
difference between observations and do the adonis on the matrix of
differences (the dissimilarity then will represent distance in terms
of absolute deviation between time points). However, this won't show
the sign of the change, which might be of interest. If you have larger
number of repeated observations, the MARSS package is just appeared on
CRAN to fit multivariate autoregressive state-space models using a
Kalman-filter approach.

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635
email <- paste("solymos", "ualberta.ca", sep = "@")
http://www.abmi.ca
http://sites.google.com/site/psolymos



On Fri, Jul 23, 2010 at 1:02 PM, Kay Cichini  wrote:
>
> dear list,
>
> i searched though the archives and the question of repeated measure analysis
> with species data was often brought up but, as far as i found, never clearly
> worked out.
>
> if i have, say two time points at wich i sampled replicated plots and thus
> had replicated paired mutivariate obervations, how to deal with them
> properly.
>
> if i used adonis, to restrict permutations on the pairs of plots (as this
> would be done in an univariate permutation test for paired observations)
> does not make sense to me but i'm also dubious about neglecting the
> dependence of the pairing.
>
> any ideas would be highly appreciated,
> kay
>
> -
> 
> Kay Cichini
> Postgraduate student
> Institute of Botany
> Univ. of Innsbruck
> 
>
> --
> View this message in context: 
> http://r-sig-ecology.471788.n2.nabble.com/repeated-measures-with-multivariate-data-tp5330819p5330819.html
> Sent from the r-sig-ecology mailing list archive at Nabble.com.
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] repeated measures with multivariate data

2010-07-25 Thread Peter Solymos
Hi Kay,
I meant to make permutations within time points, i.e. strata=time, and
not within locations (strate=locations). adonis do F-tests based on
sequential sums of squares from permutations, thus the non
independence of repeated measures can have an effect on p-values
associated with terms subsequent to the time variable.
Cheers,
Peter



On Sat, Jul 24, 2010 at 2:18 PM, Kay Cichini  wrote:
>
> hello peter,
>
> thanks for replying.
> if we stay with the before after design, where i only want to know whether
> multivariate response changed or not, you say it's ok to restrict
> permutations on pairs. but then why it does not solve the problem of
> dependence?
>
> yours,
> kay
>
> -
> 
> Kay Cichini
> Postgraduate student
> Institute of Botany
> Univ. of Innsbruck
> 
>
> --
> View this message in context: 
> http://r-sig-ecology.471788.n2.nabble.com/repeated-measures-with-multivariate-data-tp5330819p5333603.html
> Sent from the r-sig-ecology mailing list archive at Nabble.com.
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] negative F value in adonis()

2010-08-10 Thread Peter Solymos
Adriano,
what you have reported is strange, however, (1) it is not clear what
the strata 'bloco' is and if it can cause the problem (i.e. too
restrictive permutation scheme, or the strata has something to do with
your independent variable 'trat'), and (2) you are not using Sorensen
index "sor" = 2*a/(2*a+b+c) but  "sim" = pmin(b,c)/(pmin(b,c)+a).
After the presence-absence transformation the default Bray-Curtis
setting of 'adonis' should give Sorensen also.
Cheers,
Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635
http://sites.google.com/site/psolymos



On Mon, Aug 9, 2010 at 4:21 PM, Adriano Melo  wrote:
> Dear colleagues,
> I and a graduate student (Fabiana Schneck) are using the adonis() function in 
> vegan to test for the effect of two treatments (homogenous and heterogenous 
> substrate) on diatom species composition. Data resulted from a experiment 
> carried out in 11 stream reaches. Data were obtained in six visits after 
> colonization ceased and we opted to use the 66 combination of reaches and 
> time as block (strata in the argument list of the function). We were able to 
> run the function using the default 'bray' method. However, we get negative 
> F-values when using a dissimilarity matrix obtained from simpson formulae. 
> This is our code and result:
> bio.quali<-ifelse(bio>0, 1, 0)bio.quali.simp<-betadiver(bio.quali, 
> "sim")adonis(bio.quali.simp~trat, strata=bloco, permutations = )
>
> Call:
> adonis(formula
> = bio.quali.simp ~ trat, permutations = ,      strata = bloco)
>
>                               Df       SumsOfSqs    MeanSqs   F.Model   R2    
>  Pr(>F)
> trat         1.00  -0.587663  -0.587663  -8.821903 -0.0728      1
> Residuals  130.00  8.659838  0.066614  1.0728
>
> Total      131.00  8.072174  1.
>
>
>
>
> The reasoning to use the simpson dissimilarity was to assess whether 
> differences among treatments observed with Sorensen (Bray-Curtis for p/a 
> data) was due to differences in species richness (the simpson index is 
> supposed to measure turnover and be robust to differences in species 
> richness: Lennon et al. 2001: J.An.ECol 70:966-79).
> It is hard to imagine a negative F-value in a traditional univariate 
> analysis, but I am not sure whether this is possible in adonis().
> Adriano Sanches Melo
>
> Departamento de Ecologia - ICB
>
> Universidade Federal de GoiásBrazil
>
>
>
>        [[alternative HTML version deleted]]
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Logistic regression plot

2010-08-18 Thread Peter Solymos
Manuel,

it depends on whether you are interested (1) only in mean predictions
only or (2) prediction intervals as well. In the first case, this will
give you mean predictions:

x1 <- seq(min1, max1, len=25)
x2 <- seq(min2, max2, len=25)
x3 <- seq(min3, max3, len=25)
x10 <- x20 <- x30 <- rep(0, 25)
beta.hat <- c(beta0, beta1, beta2, beta3)
X <- model.matrix(~x1+x20+x30)
Y.hat <- plogis(X %*% beta.hat) # inverse logit link

and then repeat with ~ x10 + x2 + x30 etc. to control for other covariates.

In the second case, either you write a program that calculates the
distribution of Y.hat based on varying beta.hat, that is generating
parameters from a normal distribution with beta.i mean and SE.i
standard deviation. Or do the same in WinBUGS/OpenBUGS/JAGS to
generate from the posterior distribution (note that you should use the
precision = 1/variance in the prior specification). This won't be the
*joint* posterior because you don't have the variance-covariance
matrix, but should work as an approximation.

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635
email <- paste("solymos", "ualberta.ca", sep = "@")
http://www.abmi.ca
http://sites.google.com/site/psolymos



On Tue, Aug 17, 2010 at 8:15 AM, Manuel Spínola  wrote:
>  Dear List members,
>
> How can I plot in R the result from a logistic regression model with 3
> continuous explanatory variable when I only have the model output
> (coefficients and SE) and the explanatory variable range values.  I don't
> have the var-cov matrix.
> I will like to make 3 plots, one for each variable in the model.
> Thank you very much in advance.
> Best,
>
> Manuel
>
> --
> Manuel Spínola, Ph.D.
> Instituto Internacional en Conservación y Manejo de Vida Silvestre
> Universidad Nacional
> Apartado 1350-3000
> Heredia
> COSTA RICA
> mspin...@una.ac.cr
> mspinol...@gmail.com
> Teléfono: (506) 2277-3598
> Fax: (506) 2237-7036
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] hurdle model

2010-08-19 Thread Peter Solymos
Dear All,

I had a quick look at the internal functions used by pscl::hurdle to
do the numerical optimization by optim. It clearly corresponds to the
hurdle model defined in the paper/vignette, where the zero component
is based on a right censored random variable, that is 0 if the
original count data is 0 and 1 otherwise. The likelihood function for
the zero model corresponds to a censored Poisson model. The count
estimation part is based on left truncated Poisson. This is a
conditional inference thinking, the zero model tells us what
determines if the data is 0 or >0, and once the observations are >0,
than what determines the exact count. If estimates from the 2 models
are identical, it means that 0s can arise from the same Poisson
distribution as the counts. So this is not really a mixture as it is
the case with the zeroinfl() model. The resulting log-likelihood is
still valid, and table 1 clearly states that the hurdle model is based
on ML (maximum likelihood).

It is not the same estimating procedure as for the quasipoisson, where
a likelihood-like function is used to get estimates (note that
parameter estimates are the same as for Poisson, but SEs and the
dispaersion parameter are different).

To handle other overdispersion than zero inflation, one can choose NB
instead of Poisson in hurdle. The quasipoisson family is not allowed
there.

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635
email <- paste("solymos", "ualberta.ca", sep = "@")
http://www.abmi.ca
http://sites.google.com/site/psolymos



On Thu, Aug 19, 2010 at 7:22 AM, Jari Oksanen  wrote:
> On Thu, 2010-08-19 at 14:54 +0300, Gavin Simpson wrote:
>> On Thu, 2010-08-19 at 13:20 +0200, Yingjie Zhang wrote:
>
>> They fit several models and compare them:
>>
>>      I. Poisson
>>     II. Negative Binomial
>>    III. Quasi-likelihood
>>     IV. Hurdle model
>>      V. zero-inflated model
>>
>> III should be a quasi-poisson model, i.e. you fit the Poisson GLM
>> using
>> quasi-likelihood and model the dispersion parameter \phi alongside the
>> usual Poisson GLM parameters.
>>
>> Section 2.3 of their paper on the hurdle model doesn't even mention
>> "quasi". Though they do mention this in Table2.
>>
>> Reading this, I think they cooked this model themselves - you can fit
>> a
>> binomial model yourself for the presence absence and then fit a count
>> model for the samples predicted to be present from the binomial part.
>> To
>> make things simple I suspect they fitted the count part as
>> quasi-Poisson
>> but no-where does it say exactly what they did.
>
> I know that at least Jane Elith has an email address (I have used it
> years ago), so you could ask her. However, it may be  that their hurdle
> model uses just Poisson, and there is a minor mistake in their Table 2.
>
> You can use quasipoisson() or poisson() in glm() in a very natural way:
> the fitting happens via iteratively reweighted least squares, and all
> you need to define is the relationship between fitted values and
> variance. If you look at poisson() and quasipoisson() functions in R
> (these provide the backbone of the glm(..., family=)), you see that the
> differences are that quasipoissoin()$aic() always returns NA, and
> quasipoisson() lacks item simulate(). Otherwise they work in a similar
> way. Except in poisson() you take the scale (\phi) to be 1, and in
> quasipoisson() you estimate the scale from the fitted model. Then you
> just multiply standard errors with the scale, use F tests instead of
> Chisq in anova() etc.
>
> I am not sure (or actually, I don't think) that this fitting parallelism
> extends to *truncated* Poisson that is used in pscl::hurdle(). Although
> you can do fitting by stages, and fit quasipoisson() glm for above-zero
> values, I don't think this is the correct thing to do when you are not
> allowed to have new zeros. However, the truncated poisson likelihood
> model is a huge improvement over hand-fitting glm with iteratively
> reweighted least squares and assuming constant variance/fit
> relationship.
>
> If you are worried about the overdispersion of the above-zero count
> data, use the truncated negative binomial model offerred by
> pscl::hurdle(). It is designed for the purpose (and has a more exciting
> narrative for ecologists).
>
> Cheers, Jari Oksanen
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] p-Function for Empirical Distributions

2010-09-04 Thread Peter Solymos
On Fri, Sep 3, 2010 at 11:32 PM, Jane Shevtsov  wrote:
> Does R have a p-function for empirical distributions? In other words,
> how can I find out what fraction of the values in my data set are
> smaller than a given value?

Maybe
sum(x <= crit) / length(x)
Cheers,
Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
email <- paste("solymos", "ualberta.ca", sep = "@")

>
> --
> -
> Jane Shevtsov
> Ecology Ph.D. candidate, University of Georgia
> co-founder, 
> Check out my blog, Perceiving Wholes
>
> "The whole person must have both the humility to nurture the
> Earth and the pride to go to Mars." --Wyn Wachhorst, The Dream
> of Spaceflight
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Distinctiveness and contribdiv( )

2010-12-01 Thread Peter Solymos
Burak,
Not exactly clear what you want, but Lu et al. 2007 describes the
differentiation coefficient D that is the sum(beta)/sum(gamma) and can
be obtained as attributes(contribdiv(x))$diff.coef . Note also that
there is a link between diversity partitions and distance indices so
you might as well use the components from the contribution diversity
approach. Or is it something like irreplaceability in the reserve
selection literature you need?
Cheers,
Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635
email <- paste("solymos", "ualberta.ca", sep = "@")



On Wed, Dec 1, 2010 at 11:17 AM, Pekin, Burak K  wrote:
> Hello, Is there a diversity index similar to the 'contribution diversity 
> approach' contribdiv()in vegan that reflects only  the compositional 
> 'distinctiveness' of a site within a region. contribdiv()combines 
> 'distinctiveness' with a within-site diversity measure such as richness, I on 
> the other hand am looking for an index that measures only the relative number 
> of species within a site that are 'distinct' or 'unique' within a region.
>
> Cheers,
> Burak
>
>
>        [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Distinctiveness and contribdiv( )

2010-12-01 Thread Peter Solymos
Burak,

you can use contribdiv(x, scaled = TRUE) which in turn gives you C
values that are the relative contribution of diversity partitions. "A
positive value means that the contribution of a specific unit to
within-unit, among-unit or total species richness of the region is
higher than the average of all units, whereas a negative value
indicates a below average contribution." (quote from Lu et al,'s paper
cited in the ?contribdiv page). For further details and
interpretation, you should consult the paper:

Lu, H. P., Wagner, H. H. and Chen, X. Y. 2007. A contribution
diversity approach to evaluate species diversity. Basic and Applied
Ecology, 8, 1–12.

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635
email <- paste("solymos", "ualberta.ca", sep = "@")



On Wed, Dec 1, 2010 at 12:34 PM, Pekin, Burak K  wrote:
> Yes Peter, I am looking for something that really conveys irreplaceability, 
> in other words, the relative 'endemism' of species within a site. So, a site 
> that contains species which are found only in a few other sites within a 
> region is considered more 'irreplaceable' or 'distinct' than a site that 
> contains more 'generalized' species which are found in many sites within the 
> region, irrespective of the total number of species found within them. Does 
> the beta component of contribdiv ( ) do this, or does it also take into 
> account the total number of species within the sites?
> -Burak
>
> -Original Message-
> From: psoly...@gmail.com [mailto:psoly...@gmail.com] On Behalf Of Peter 
> Solymos
> Sent: Wednesday, December 01, 2010 1:36 PM
> To: Pekin, Burak K
> Cc: r-sig-ecology@r-project.org
> Subject: Re: [R-sig-eco] Distinctiveness and contribdiv( )
>
> Burak,
> Not exactly clear what you want, but Lu et al. 2007 describes the 
> differentiation coefficient D that is the sum(beta)/sum(gamma) and can be 
> obtained as attributes(contribdiv(x))$diff.coef . Note also that there is a 
> link between diversity partitions and distance indices so you might as well 
> use the components from the contribution diversity approach. Or is it 
> something like irreplaceability in the reserve selection literature you need?
> Cheers,
> Peter
>
> Péter Sólymos
> Alberta Biodiversity Monitoring Institute and Boreal Avian Modelling project 
> Department of Biological Sciences CW 405, Biological Sciences Bldg University 
> of Alberta Edmonton, Alberta, T6G 2E9, Canada
> Phone: 780.492.8534
> Fax: 780.492.7635
> email <- paste("solymos", "ualberta.ca", sep = "@")
>
>
>
> On Wed, Dec 1, 2010 at 11:17 AM, Pekin, Burak K  wrote:
>> Hello, Is there a diversity index similar to the 'contribution diversity 
>> approach' contribdiv()in vegan that reflects only  the compositional 
>> 'distinctiveness' of a site within a region. contribdiv()combines 
>> 'distinctiveness' with a within-site diversity measure such as richness, I 
>> on the other hand am looking for an index that measures only the relative 
>> number of species within a site that are 'distinct' or 'unique' within a 
>> region.
>>
>> Cheers,
>> Burak
>>
>>
>>        [[alternative HTML version deleted]]
>>
>> ___
>> R-sig-ecology mailing list
>> R-sig-ecology@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>
>>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Converting species counts into species list

2010-12-06 Thread Peter Solymos
Ophelia and Others,

A more general solution is:

## define a function (that is part of the mefa package)
`rep.data.frame` <- function(x, ...)
as.data.frame(lapply(x, rep, ...))
## example from ?mefa:::rep.data.frame
x <- data.frame(sample = LETTERS[c(1,1,2,2,3)],
species = letters[c(5,5,5,6,7)],
count = c(1,2,10,3,4), segment = letters[c(8,9,8,9,8)])
x
rep(x[,c(1,2,4)], times = x[,3])
rep(x[,c(1,2,4)], each = 2)

Cheers,

Peter


On Mon, Dec 6, 2010 at 5:07 AM, zhangjl  wrote:
> Dear Ophelia,
>     If your dataframe is already in memory, simply write
>    as.character(rep(dat[,1],dat[,2]) ,
> to finish the task.
>
> Aegiinte        3
> Agonperu      3
> Aiouimpr       2
> Albiniop        2
> Albisubd       1
> ..
> else, you may try the following code, if your data is saved on the disk.
>
> dat <- read.csv("dat2.csv")
> writeLines(as.character(rep(dat[,1],dat[,2])), "Lines.csv")
>
> Hope these commands will help.
>
> best,
> Jinlong
>
>
> 2010-12-06
>
>
>
> Jinlong Zhang
> Ph.D.Candidate
> State Key Laboratory for Vegetation and Environmental Change,
> Institute of Botany, The Chinese Academy of Sciences
> Nanxincun 20,Xiangshan, Beijing 100093
> E-mails:
> zhan...@ibcas.ac.cn
> jinlongzhan...@gmail.com
>
>
>
>
> 发件人: Ophelia Wang
> 发送时间: 2010-12-06  16:51:21
> 收件人: r-sig-ecology
> 抄送:
> 主题: [R-sig-eco] Converting species counts into species list
>
> Hello,
> I have a list of species and abundance that looks like this:
> Aegiinte        3
> Agonperu      3
> Aiouimpr       2
> Albiniop        2
> Albisubd       1
> ..
> And I need to convert these two columns into a column in which each
> cell represents an individual of a articular species:
> Aegiinte
> Aegiinte
> Aegiinte
> Agonperu
> Agonperu
> Agonperu
> Aiouimpr
> Aiouimpr
> Albiniop
> Albiniop
> Albisubd
> .
> Can someone help me to develop a script to do the conversion?
> Thanks a lot!
> Ophelia
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>        [[alternative HTML version deleted]]
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] How to change the pattern of my dataset??

2010-12-21 Thread Peter Solymos
Yong Zhang,

I think this is what you are looking for:

library(mefa)
x <- matrix(c(
   2,  3,  4, 2,
   5,  6,  5, 2,
   4,  3,  4, 5,
   4,  5,  4, 1), 4, 4, byrow=TRUE)
colnames(x) <- c("site1",  "site2", "site3", "site4")
rownames(x) <- c("A",  "B", "C", "D")
melt(as.mefa(t(x)))

HTH,

Peter

ps: just had seen a wonderful lunar eclipse at winter solstice, once
in every 372 years...



2010/12/21 Yong Zhang <2010202...@njau.edu.cn>:
> Dear all
>
> I have to appologize in advance , because I am very new about this. Thing is 
> like this, now I have a dataset containing species abundance, species names 
> and sampling sites. See below:
>
>           site1  site2 site3 site4.
> A           2      3      4     2
> B           5      6      5     2
> C           4      3      4     5
> D           4      5      4     1
> .
> .
> .
> .
> .
> .
> .
> .
> How should I do to change it into a dataset like below:
> site1 A 2
> site1 B 5
> site1 C 4
> site1 D 4
> .
> .
> .
> .
> .
> .
>
> site2 A 3
> site2 B 6
> site2 C 3
> site2 D 5
> .
> .
> .
> .
> .
> .
> site3 A 4
> site3 B 5
> site3 C 4
> site3 D 4
> .
> .
> .
> .
> .
> site4 A 2
> site4 B 2
> site4 C 5
> site4 D 1
> .
> .
> .
> .
> .
> .
> .
>
> Could one know which package should I take?  And, which function shou I use? 
> Please answer in details if you can. Thanks very much for you time and 
> suggestion.
>
> All the best.
>
>
>
>
> Yong Zhang, Ph.D.
> Lab of aquatic insects & stream ecology
> Dept.of Entonology, Nanjing Agricultural University
> Nanjing, 210095,China
> Phone number:  (+86) -25-84395241
> E-mail:2010202...@njau.edu.cn
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] omit levels from boxplots

2011-01-06 Thread Peter Solymos
Hi,

for one factor, it is enough to do

dat$ageclass <- dat$ageclass[drop=TRUE]

the 'purgef' function applies the drop statement for each columns in
the data frame and eventually returns a list because that's how
'lapply' works. If you want a data frame in the end, you can either do
(as I recall lists also work supplied as data argument, but just to be
consistent)

dat[] <- lapply(dat, function(x) x[drop=TRUE])

or modify the function as

purgef2 <- function(x) as.data.frame(lapply(x, function(x) x[drop = TRUE]))

Let's see for example:

dat <- data.frame(ageclass=factor(1:2, levels=1:4), sex=factor("male",
levels="male"))
purgef <- function(x) lapply(x, function(x) x[drop = TRUE])
purgef(dat)

$ageclass
[1] 1 2
Levels: 1 2

$sex
[1] male male
Levels: male

purgef2(dat)

  ageclass  sex
11 male
22 male

But maybe Ben's update is just enough (it came in as I was writing this reply).

Cheers,

Peter



On Thu, Jan 6, 2011 at 10:16 AM, Jarrett Byrnes  wrote:
> Ah, so, you have factors that are no longer present in the data.  I use a 
> quick function I wrote called purgef to get rid of zombie factors.
>
> purgef<-function(x) lapply(x, function(x) x[drop = TRUE])
>
>
> So, here, before your boxplot
>
> dat$ageclass <- purgef(dat$ageclass)
>
> On Jan 6, 2011, at 9:09 AM, Howe, Eric (MNR) wrote:
>
>> Good day list members,
>>
>>
>>
>> I was wondering if anyone knows a way to omit specific columns or anova
>> cells from boxplots.
>>
>>
>>
>> E.g. I produce a boxplot as:
>>
>>
>>
>>> boxplot(y ~ ageclass + sex,data=dat)
>>
>>
>>
>> where ageclass has 4 levels (including 1-year-old), and sex has 2
>> levels, but the only males included in the analysis are one-year-olds.
>>
>> The resulting boxplot illustrates differences among age classes of
>> females, and between male and female one-year-olds.
>>
>> However, it has 3 blank columns, corresponding to subadult, young adult,
>> and mature age classes of males.
>>
>> I'd like to exclude those blank columns from the figure.
>>
>>
>>
>> Thanks in advance,
>>
>>
>>
>> Eric
>>
>>
>>
>>
>>
>>
>>       [[alternative HTML version deleted]]
>>
>> ___
>> R-sig-ecology mailing list
>> R-sig-ecology@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] number of observations used in scatterplot.matrix()

2011-01-21 Thread Peter Solymos
Maria,

You can have number of complete pairs for each column pair combination as

(x <- matrix(c(NA,NA,NA,1,2,3), 2, 3))
(x.na <- !is.na(x))
t(x.na) %*% x.na

You can supply this as is or its lower triangle as vector to the
plotting function.

Cheers,

Peter



On Fri, Jan 21, 2011 at 8:25 AM, Maria Dulce Subida
 wrote:
> Hi Luciano,
>
> Thank you for your suggestion. However I'm afraid it will not work,
> since the actual number of observations used by cor.test() is not the
> total number of observations (nrow(x))  I have.
> cor.test() omits observations with NAs in x or in y, which do actually
> occur in my data sets.
>
> Kind regards,
>
> Dulce
>
>
> El 21/01/2011 16:13, Luciano Selzer escribió:
>> Hi, maybe something like this should do the trick:
>>           n <- nrow(x)
>>
>>             txt <- paste("rho=", txt, "\np=",pval, "n=", n, sep="")
>>
>>
>> HTH
>> Luciano
>>
>>
>> 2011/1/21 Maria Dulce Subida > >
>>
>>     Dear all,
>>
>>     I have a script (please see below) that gives a scatter plot matrix on
>>     the upper panel, and the spearman rho and probability values on the
>>     lower panel. It also gives the density function of each variable
>>     in the
>>     diagonal, adds a smoother and a linear regression line to each scatter
>>     plot, and puts significant spearman coefficient values in red.
>>     Now I would like to add to the lower panel information, the value of
>>     n=number of observations used to calculate the spearman correlation
>>     coefficient. Does anyone know I could I add that parameter to my
>>     function panel.cor?
>>
>>     Thank you!
>>
>>     Kind regards,
>>
>>     Dulce
>>
>>     
>>
>>     panel.cor <- function (x, y,method="spearman",digits=2,...)
>>     {
>>             points(x,y,type="n");
>>             usr <- par("usr"); on.exit(par(usr))
>>             par(usr = c(0, 1, 0, 1));
>>             correl <- cor.test(x, y,method=method);
>>             r=correl$estimate;
>>             pval=correl$p.value;
>>             color="black";
>>             if (pval<0.05) color="red";
>>             txt <- format(r,digits=2)
>>             pval <- format(pval,digits=2)
>>
>>            n <- nrow(x)
>>
>>             txt <- paste("rho=", txt, "\np=",pval, "n=", n, sep="")
>>             text(0.5, 0.5, txt,col=color)
>>     }
>>
>>     scatterplot.matrix (~ var1 + var2 + var3 , data=mydata,
>>                                     main="Mydata", smooth=TRUE,
>>     lower.panel=panel.cor, pch=20, cex=0.5,
>>     col=c("red","black"),cex.labels=1, font.labels=2, lwd=0.5)
>>
>>
>>
>>
>>
>>            [[alternative HTML version deleted]]
>>
>>     ___
>>     R-sig-ecology mailing list
>>     R-sig-ecology@r-project.org 
>>     https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>
>>
>
>        [[alternative HTML version deleted]]
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Function twostagechao {vegan}

2011-03-22 Thread Peter Solymos
Dear Diogo,

Thanks for self-correction, I reply for clarification.

The twostagechao function was in the developmental version of vegan
but got removed on 2009-12-16 07:12:59 +0100 (Wed, 16 Dec 2009) at
revision 1083 along its documentation. It never went into vegan stable
release. The function was dropped due to ambiguous interpretation of
double summation in Eqn 12b of Chao et al. 2008 Biometrics 64,
1178--1186. Thus the developmental version of the function was
probably incorrect, therefore I would warn against using it even if
one can dig it out from R-Forge.

Best,

Peter


On Tue, Mar 22, 2011 at 8:20 AM, Diogo B. Provete  wrote:
> Hi,
> I'm sorry, I've just read the documentation of the package and I noticed
> that the function was deleted.
>
> Thanks,
> Diogo
>
> Em 22 de março de 2011 11:17, Diogo B. Provete escreveu:
>
>> Hi all,
>> I had some problem when trying to use the function twostagechao in the
>> package vegan, R returned that the function was not found, despite it appear
>> in the documentation of the package.
>>
>> Thanks in advance
>>
>> --
>> Atenciosamente,
>> *Diogo Borges Provete*
>>
>> ==
>> Biólogo
>> Mestre em Biologia Animal (UNESP)
>> Doutorando PPG Ecologia e Evolução
>> Laboratório de Ecologia de Insetos (sl. 222)
>> Departamento de Ecologia
>> Instituto de Ciências Biológicas - ICB 1
>> Universidade Federal de Goiás, campus II - UFG
>> Goiânia-GO
>> CP: 131
>> 74001-970
>> Brazil
>>
>> Tel. Lab. +55 62 3521-1732
>> *Cel. +55 *62 8231-5775
>> *
>> *: diogoprovete
>> **: diogop...@yahoo.com.br
>>
>> *Personal web page *
>>
>> Traduza conosco:
>> 
>> <<> linguisticas
>> >>>
>> <<>> 
>> 
>> ==
>>
>>
>
>
> --
> Atenciosamente,
> *Diogo Borges Provete*
>
> ==
> Biólogo
> Mestre em Biologia Animal (UNESP)
> Doutorando PPG Ecologia e Evolução
> Laboratório de Ecologia de Insetos (sl. 222)
> Departamento de Ecologia
> Instituto de Ciências Biológicas - ICB 1
> Universidade Federal de Goiás, campus II - UFG
> Goiânia-GO
> CP: 131
> 74001-970
> Brazil
>
> Tel. Lab. +55 62 3521-1732
> *Cel. +55 *62 8231-5775
> *
> *: diogoprovete
> **: diogop...@yahoo.com.br
>
> *Personal web page *
>
> Traduza conosco:
> 
> << linguisticas

> <<>>
> 
> ==
>
>        [[alternative HTML version deleted]]
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Specifying a lognormal distribution

2011-03-26 Thread Peter Solymos
Hi Scott,
Lognormal refers to a probability distribution of a random variable
whose logarithm is normally distributed. Said so, if you log transform
your CV, you can apply Gaussian family, or simply lm().
Cheers,
Peter


On Sat, Mar 26, 2011 at 9:16 AM, Scott Chamberlain
 wrote:
> Dear sigecos,
>
>
> I am trying to analyze an ANOVA model where both predictors are categorical, 
> and the response variable is coefficient of variation. CV fits best a 
> lognormal distribution. Is there a way to specify a lognormal distribution of 
> the response variable? The glm function allows many distributions in the 
> family argument, but not for lognormal. Does anyone know of user defined 
> functions for lognormal perhaps?
>
>
> Thanks, Scott Chamberlain
>        [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] zero-truncated Poisson

2011-04-09 Thread Peter Solymos
Jeff,

The zero truncated Poisson distribution can be described as a
probability function conditional on Y>0:

Pr(Y=y|Y>0) = Pr(Y=y) / (1-Pr(Y=0)), y=1,2,3,...

This leads to this function 'nll' for the negative log-likelihood:

nll <- function(theta, y, X) {
lambda <- exp(drop(X %*% theta))
-sum(dpois(y, lambda, log=TRUE) - log(1-exp(-lambda)))
}

where theta is the vector of parameters, y is the observation vector,
X is the design matrix.

A simple simulation is useful to check that the theory works. I used
'optim' with the default Nelder-Mead simplex algorithm, initial values
are based in non-truncated Poisson estimates to speed up convergence:

theta <- c(1, 2)
X <- model.matrix(~rnorm(100))
y <- rpois(100,exp(X %*% theta))
X <- X[y>0,]
y <- y[y>0]
inits <- coef(glm(y ~ X-1, family=poisson))
res <- optim(par=inits, fn=nll, y=y, X=X)
cbind(theta=theta, theta.hat=res$par)

Here I used the data you provided, note that 'hessian=TRUE' for some
subsequent SE calculations:

inits <- coef(glm(taken ~ mass + year, blja, family=poisson))
y <- blja$taken
X <- model.matrix(~ mass + year, blja)
res <- optim(par=inits, fn=nll, y=y, X=X, hessian=TRUE)

Vector of parameters that minimize 'nll' (maximize the log-likelihood)
given the data:

cf <- res$par

Standard error (note that due to minimization, I used the Hessian
instead of -H):

se <- sqrt(diag(solve(res$hessian)))

z and p values and the final output table:

zstat <- cf/se
pval <- 2 * pnorm(-abs(zstat))
out <- cbind(cf, se, zstat, pval)
colnames(out) <- c("Estimate", "Std. Error", "z value", "Pr(>|z|)")

printCoefmat(out, digits = 4, signif.legend = FALSE)

For your data set, this gives:

Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.763920.10659  -7.167 7.66e-13 ***
mass 0.240080.01262  19.028  < 2e-16 ***
yeartwo -0.290060.11974  -2.423   0.0154 *

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635
email <- paste("solymos", "ualberta.ca", sep = "@")
http://www.abmi.ca
http://www.borealbirds.ca
http://sites.google.com/site/psolymos



On Fri, Apr 8, 2011 at 6:40 PM, Stratford, Jeffrey
 wrote:
> Thanks Ben,
>
> That didn't do the trick. I checked all the classes
>
>    year      size  distance     taken      mass
>  "factor"  "factor" "numeric" "integer" "numeric"
>
> And used the notation that you suggested and I did get the same error.
>
> Just so it's not such a black box for me, what makes vector GLMs
> different from "ordinary" GLMs?
>
> Thanks,
>
> Jeff
>
> Here's all the data.
>
> structure(list(year = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L,

Re: [R-sig-eco] Multiple count if style "queries"

2011-05-20 Thread Peter Solymos
Chris,

Something like this should do the job:

1.
rowSums(xtabs(~ ECO_NAME + Genus, x) > 0)
2.
rowSums(xtabs(~ ECO_NAME + Genus, x))

Cheers,

Peter



On Fri, May 20, 2011 at 3:19 PM, Chris Mcowen  wrote:
> Dear List,
>
> I am looking to calculate two things from my data frame and was after some 
> advice. For the example below i want to know.
>
> 1. How many unique Orders/Families and Genera there are per eco-name
>
> 2. How many incidences are there for each Order/Family and Genus there are 
> per eco-region
>
> I have 650 econame.
>
> I.e for Biak-Numfoor rain forests there are 2 orders, 2 families and two 
> genera. Also, Alismatales are represented once, Asparagales once etc etc.
>
> Thanks for any advice.
>
> Chris
>
>
> ECO_NAME                                                        Order         
>   Family                                  Genus
> Biak-Numfoor rain forests                               Alismatales           
>   Araceae                         Homalomena
> Biak-Numfoor rain forests                               Asparagales           
>   Orchidaceae                     Bromheadia
> Central Range montane rain forests      Alismatales             Araceae       
>                   Homalomena
> Central Range montane rain forests      Asparagales             Lomandraceae  
>           Cordyline
> Central Range montane rain forests      Asparagales             Orchidaceae   
>                   Thelymitra
> Central Range montane rain forests      Asparagales             Orchidaceae   
>                   Dendrobium
> Central Range montane rain forests      Asparagales             Orchidaceae   
>                   Vanda
> Central Range montane rain forests      Asparagales             Orchidaceae   
>                   Bulbophyllum
> Central Range montane rain forests      Asparagales             Orchidaceae   
>                   Bulbophyllum
> Central Range montane rain forests      Asparagales             Orchidaceae   
>                   Dendrobium
> Central Range montane rain forests      Asparagales             Orchidaceae   
>                   Dendrobium
> Central Range montane rain forests      Commelinales    Commelinaceae         
>   Murdannia
> Central Range montane rain forests      Poales                  
> Centrolepidaceae        Centrolepis
> Central Range montane rain forests      Poales                  Cyperaceae    
>                   Machaerina
> Central Range montane rain forests      Poales                  Cyperaceae    
>                   Eleocharis
> Central Range montane rain forests      Poales                  Eriocaulaceae 
>           Eriocaulon
> Central Range montane rain forests      Poales                  Poaceae       
>           Schizostachyum
> Central Range montane rain forests      Poales                  Poaceae       
>           Poa
> Central Range montane rain forests      Zingiberales            Zingiberaceae 
>           Alpinia
> Central Range montane rain forests      Zingiberales            Zingiberaceae 
>           Curcuma
> Central Range montane rain forests      Zingiberales            Zingiberaceae 
>           Amomum
> Huon Peninsula montane rain forests     Asparagales             Orchidaceae   
>                   Taeniophyllum
> Huon Peninsula montane rain forests     Asparagales             Orchidaceae   
>                   Corybas
> Huon Peninsula montane rain forests     Asparagales             Orchidaceae   
>                   Thelymitra
> Huon Peninsula montane rain forests     Asparagales             Orchidaceae   
>                   Glomera
> Huon Peninsula montane rain forests     Poales                  
> Centrolepidaceae        Centrolepis
> Huon Peninsula montane rain forests     Poales                  Poaceae       
>           Poa
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>
___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Multiple count if style "queries"

2011-05-20 Thread Peter Solymos
On Fri, May 20, 2011 at 4:04 PM, Chris Mcowen  wrote:
> Thanks Peter,
>
> This is what i am after but i would like this ( the counts) for each ECO_NAME 
> not the total for the entire data set.
>
> A sample of my data is below
>
> data2 <- structure(list(ECO_NAME = structure(c(1L, 1L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
> 3L, 3L, 3L, 3L, 3L), .Label = c("Biak-Numfoor rain forests",
> "Central Range montane rain forests", "Huon Peninsula montane rain forests"
> ), class = "factor"), Order = structure(c(1L, 2L, 1L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L,
> 2L, 2L, 2L, 2L, 4L, 4L), .Label = c("Alismatales", "Asparagales",
> "Commelinales", "Poales", "Zingiberales"), class = "factor"),
>    Family = structure(c(1L, 7L, 1L, 6L, 7L, 7L, 7L, 7L, 7L,
>    7L, 7L, 3L, 2L, 4L, 4L, 5L, 8L, 8L, 9L, 9L, 9L, 7L, 7L, 7L,
>    7L, 2L, 8L), .Label = c("Araceae", "Centrolepidaceae", "Commelinaceae",
>    "Cyperaceae", "Eriocaulaceae", "Lomandraceae", "Orchidaceae",
>    "Poaceae", "Zingiberaceae"), class = "factor"), Genus = structure(c(13L,
>    3L, 13L, 6L, 19L, 9L, 20L, 4L, 4L, 9L, 9L, 15L, 5L, 14L,
>    10L, 11L, 17L, 16L, 1L, 8L, 2L, 18L, 7L, 19L, 12L, 5L, 16L
>    ), .Label = c("Alpinia", "Amomum", "Bromheadia", "Bulbophyllum",
>    "Centrolepis", "Cordyline", "Corybas", "Curcuma", "Dendrobium",
>    "Eleocharis", "Eriocaulon", "Glomera", "Homalomena", "Machaerina",
>    "Murdannia", "Poa", "Schizostachyum", "Taeniophyllum", "Thelymitra",
>    "Vanda"), class = "factor")), .Names = c("ECO_NAME", "Order",
> "Family", "Genus"), class = "data.frame", row.names = c(NA, -27L
> ))

Chris,

How about this? (Note that summed incidences are the same for all
taxonomic levels.)

> data.frame(unique.Genus=rowSums(xtabs(~ ECO_NAME + Genus, data2) > 0),
+ unique.Family=rowSums(xtabs(~ ECO_NAME + Family, data2) > 0),
+ unique.Order=rowSums(xtabs(~ ECO_NAME + Order, data2) > 0),
+ incidences.Genus=rowSums(xtabs(~ ECO_NAME + Genus, data2)),
+ incidences.Family=rowSums(xtabs(~ ECO_NAME + Family, data2)),
+ incidences.Order=rowSums(xtabs(~ ECO_NAME + Order, data2)))
unique.Genus unique.Family unique.Order
Biak-Numfoor rain forests  2 22
Central Range montane rain forests16 95
Huon Peninsula montane rain forests6     3    2
incidences.Genus incidences.Family
Biak-Numfoor rain forests  2 2
Central Range montane rain forests1919
Huon Peninsula montane rain forests6 6
incidences.Order
Biak-Numfoor rain forests  2
Central Range montane rain forests19
Huon Peninsula montane rain forests6

Peter

>
>
> On 20 May 2011, at 22:32, Peter Solymos wrote:
>
> Chris,
>
> Something like this should do the job:
>
> 1.
> rowSums(xtabs(~ ECO_NAME + Genus, x) > 0)
> 2.
> rowSums(xtabs(~ ECO_NAME + Genus, x))
>
> Cheers,
>
> Peter
>
>
>
> On Fri, May 20, 2011 at 3:19 PM, Chris Mcowen  wrote:
>> Dear List,
>>
>> I am looking to calculate two things from my data frame and was after some 
>> advice. For the example below i want to know.
>>
>> 1. How many unique Orders/Families and Genera there are per eco-name
>>
>> 2. How many incidences are there for each Order/Family and Genus there are 
>> per eco-region
>>
>> I have 650 econame.
>>
>> I.e for Biak-Numfoor rain forests there are 2 orders, 2 families and two 
>> genera. Also, Alismatales are represented once, Asparagales once etc etc.
>>
>> Thanks for any advice.
>>
>> Chris
>>
>>
>> ECO_NAME                                                        Order        
>>    Family                                  Genus
>> Biak-Numfoor rain forests                               Alismatales          
>>    Araceae                         Homalomena
>> Biak-Numfoor rain forests                               Asparagales          
>>    Orchidaceae                     Bromheadia
>&g

Re: [R-sig-eco] Help with distribution fitting and AIC in R

2011-08-06 Thread Peter Solymos
On Sat, Aug 6, 2011 at 4:06 AM, Lene Jung  wrote:
> HI,
>
> I’m having several problems trying to fit distributions to data that I have
> sorted into a data frame, so the each ID has its own step length and turn
> angle.
>
>
>
> I can fit a Weibull distribution to step lengths with following code:
>
>
>
> ###create data frame###
>
> ID1 <- na.omit(ID)
>
> weib.test = data.frame(ID1, step1)
>
> set.seed(144)
>
>
>
> weib.dist<-rweibull(1,shape=3,scale=8)
>
> names(weib.test)<-c('ID','steplength')
>
>
>
> g <- function(step1) {
>
>     require(MASS)
>
>     est <- fitdistr(step1, 'weibull')$estimate
>
>                AIC(est)
>
>     list(shape = est[1], scale = est[2])
>
>
>
>  }
>
> library(data.table)
>
> weib.test.dt <- data.table(weib.test, key = 'ID')
>
> weib.test.dt[, g(steplength), by = ID1]
>
>
>
> This gives me the following output:
>
>
>
>  ID1     shape    scale
>
> [1,]   1 0.8089580 131.1514
>
> [2,]   2 0.8393755 106.2641
>
> [3,]   3 0.8421661 196.7409
>
> [4,]   4 0.9096443 192.8869
>
> [5,]   5 0.8170656 172.3526
>
> [6,]   6 0.9815219 136.5133
>
> [7,]   7 0.9485611 133.9226
>
> [8,]   8 0.9279664 158.5760
>
> [9,]   9 0.7980447 250.8815
>
> [10,]  10 0.7882116 132.3324
>
> [11,]  11 0.6684602 176.5392
>
> [12,]  12 0.7636310 105.1567
>
> [13,]  13 0.7179191 126.0132
>
> [14,]  14 0.6833156 183.6390
>
> [15,]  15 0.9356103 171.3479
>
> [16,]  16 0.7865950 129.0388
>
> [17,]  17 0.8408600 157.9560
>
> [18,]  18 0.8023891 127.5567
>
>
>
> Which is exactly what I need – separate shape and scale parameters for each
> ID. However I also want to get AIC values for each ID, and I have tried
> several things such as
>
> AIC(weib.test.dt, by = ID1)
>
> or
>
> extractAIC(fit, scale, k = 2, ...)AIC(weib.test.dt)
>
> stopifnot(all.equal(AIC(weib.test.dt),
>
>                    AIC(logLik(weib.test.dt
>
>
>
>
>
> But I keep getting an eror:  Error in UseMethod("logLik") :
>
>  no applicable method for 'logLik' applied to an object of class
> "c('data.table', 'data.frame')"
>
>
>
> I’m assuming this had to do with the multiple ID’s? Should I just fit the
> distributions separately for each ID instead and then run AIC?
>

Lene,

You need to return the fitted object 'est', the error message tells
you that data.frame object class does not come with a logLik method,
unlike a fitdistr class returned by the function fitdistr(). You can
then use AIC, which assumes there is a logLik method with
log-likelihood and number of parameters defined. Alternatively, you
can return c(est$estimate, AIC(est)) that gives you a data frame with
AIC in 3rd column.

Careful reading of the help page for fitdistr would tell you the
structure of the return object.

>
>
> Furthermore, I would like to do the same thing as above with turn angles –
> but by fitting a wrapped Cauchy distribution to the several IDs. I have
> tried following code:
>
>
>
> ###wrapped cauchy fitting, simple mean and rho###
>
> mu0<- circ.mean(turn1)
>
> rho0 <- est.rho(turn1)
>
>
>
> wcauchy.test = data.frame(ID2, turn1)
>
> set.seed(144)
>
>
>
> wcauchy.dist<-rweibull(1,mu0,rho0)
>
> names(wcauchy.test)<-c('ID','turnangle')
>
>
>
> z <- function(turn1) {
>
>     require(MASS)
>
>     est <-wrpcauchy.ml(turn1, mu0, rho0, acc=1e-015)$estimate
>
>     list(mu0 = est[1], rho0 = est[2])
>
>  }
>
>
>
> library(data.table)
>
> wcauchy.test.dt <- data.table(wcauchy.test, key = 'ID')
>
>
>
> wcauchy.test.dt[, z(turnangle), by = ID2]
>
>
>
> But I keep getting the error: Error in `[.data.table`(wcauchy.test.dt, ,
> z(turnangle), by = ID2) :
>
>  Type 0 in j column
>
>
>
> And of course I would like to calculate the AIC as well for distribution
> fit.
>
>
>
> Can anyone help me out with this?
>

There is no wrpcauchy.ml function in MASS, but in CircStats. It
returns only the estimates, and no logLik method is defined. From the
code or from the references, it should be possible to figure out PDF
for the wrapped Cauchy distribution, and from that it is easy to
calculate AIC.

Again, this is obvious from the help page.

Cheers,

Peter

>
>
> Thanks,
>
> Lene Jung Kjaer
>
>
>
> ___
>
> Lene Jung Kjær
>
> Studsdalvej 20, Taulov
>
> 7000 Fredericia, Danmark
>
> Tlf: 29 86 96 14
>
> email: ljk...@hotmail.com
>
>
>
>
>        [[alternative HTML version deleted]]
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Calculating an AIC for the Fisher's log series.....

2011-10-05 Thread Peter Solymos
Cory,

There is a profile method for fisherfit. If that's not enough, you can
have a look inside of fisherfit, and you'll see the Dev.logseries
internal function and how it is used in nlm to get estimate of alpha.
The same function can be used for manual profiling to get likelihood
values for a sequence of alphas. Note that Dev.logseries returns minus
log-likelihood. You get AIC as -2*logLik+2*p, or just define a logLik
method for fisherfit as

logLik.fisherfit <- function(object, ...) {
ll <- -object$minimum
attr(ll, "df") <- length(object$estimate)
attr(ll, "nobs") <- length(object$fisher)
class(ll) <- "logLik"
ll
}

This way, AIC, BIC, etc will work naturally:

data(BCI)
mod <- fisherfit(BCI[5,])
logLik(mod)
AIC(mod)
BIC(mod)

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635
email <- paste("solymos", "ualberta.ca", sep = "@")
http://www.abmi.ca
http://www.borealbirds.ca
http://sites.google.com/site/psolymos



On Wed, Oct 5, 2011 at 11:42 AM, Cory Redman  wrote:
> I am trying to calculate an AIC for my Fisher's log series (fisherfit).  In
> order to do this I need the log likelihood.  Does anyone know how to
> calculate the log likelihood for a specific alpha value.  I know that the
> second derivative of the log-likelihood is used to calculate the standard
> error of alpha, but does anyone know how to retrieve these values (Oksanen,
> Vegan: ecological diversity).
>
> Cheers,
>     Cory
>
> --
> __
> Cory M. Redman
> Doctoral Candidate
> Texas A&M University
> Department of Geology & Geophysics
> 3115 TAMU
> College Station, TX  77843-3115
>
> Your point of view of the world depends greatly on where you stand and what
> you had to do to get there.
>
>        [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] How can I combine several workspaces into one?

2012-02-10 Thread Peter Solymos
Guojun,

Make lists of them with this function:

fun <- function(file) {
x <- new.env()
load(file, x)
as.list(x)
}

HTH,

Peter



On Fri, Feb 10, 2012 at 1:52 PM, lgj200306  wrote:
> Hi, everyone!
> I have several R workspaces, with each contains many files. I want to combine 
> these workspaces into one. Which command should I use to achieve my target?
> Thank you very much and best wishes for you!
>
> Guojun Lin
> Feb. 10th, 2012
>
>        [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Model for zero-inflated species abundance data, allowing for spatial autocorrelation?

2012-02-15 Thread Peter Solymos
Kay and Alexandre,

INLA approach might be fine, but given the data you described, I would
rather think about what might cause the zero inflation (90% zeros) and
the spatial autocorrelation and pick a model accordingly. For example
if the zero inflation might be caused by low abundance related to low
habitat suitability, underlying occupancy/range pattern, or detection
error, or some combination of these. Depending on the cause of the
zeros, the treatment of autocorrelation might change. If you have all
nonzero counts clumped together, it might be explained by a spatial
covariate effect. If not, 10% nonzero count in the range of 1-6 won't
necessarily allow a meaningful estimation of spatial autocorrelation
(of course depending on sample size). On the other hand, if you have
nonzero counts scattered in the landscape, estimating autocorrelation
again will be difficult as it roughly decreases at a rho^distance
rate.

Detection error is a whole other can of worms, but using single visit
(without temporal replication) to sites actually allow to correct for
detection error as long as covariates are available (see svabu
function in detect package and references on svabu help page, this
implements ZIP model with detection error and without spatial
autocorrelation).

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635
email <- paste("solymos", "ualberta.ca", sep = "@")
http://www.abmi.ca
http://www.borealbirds.ca
http://sites.google.com/site/psolymos



On Tue, Feb 14, 2012 at 5:25 PM, Kay Cichini  wrote:
> Alexandre, many thanks, there seems to be plenty of useful content on the
> linked site!
>
> Am 14.02.2012 14:56 schrieb "Alexandre Villers" >:
>>
>> I answered by mistake on the R SIG GEO sorry for cross posting.
>>
>> -
>>
>> Hi,
>>
>>
>> If you are not afraid to switch to Bayesian methods (but fast ones !),
> the INLA package could do that for you. ( www.r-inla.org )
>> Cheers
>>
>>
>> Alex
>>
>> On 14/02/2012 15:36, Kay Cichini wrote:
>>>
>>> Hi all,
>>>
>>> I'd very much appreciate pointers at methodology/R-packages/functions
>>> suited for modelling zero inflated species abundance data (counts) that
>>> most likely will show spatial autocorrelation. Species abundances range
>>> between 0 and 6, the proportion of zeroes is about 90%. I have about 6
>>> nominal covariables (partly heavily unbalanced).
>>>
>>> Yours,
>>> Kay
>>>
>>>        [[alternative HTML version deleted]]
>>>
>>> ___
>>> R-sig-ecology mailing list
>>> R-sig-ecology@r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>>
>>
>> ___
>> R-sig-ecology mailing list
>> R-sig-ecology@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>        [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Using vegan with a large data set

2012-03-05 Thread Peter Solymos
Bier,

Solutions might depend on OS and 32/64 bit build that you are using.
For general info, have a look at R FAQ:
http://cran.r-project.org/bin/windows/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021
or read help("Memory-limits")

7000*50 is usually not considered big data nowadays, but the
calculations can eat up the memory regardless. If brute force memory
allocation does not solve your problem, you can randomly take subsets
of the data and combine the estimates in the end.

HTH,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635
email <- paste("solymos", "ualberta.ca", sep = "@")
http://www.abmi.ca
http://www.borealbirds.ca
http://sites.google.com/site/psolymos



On Mon, Mar 5, 2012 at 4:42 PM, Bier Ekaphan Kraichak
 wrote:
> Dear list,
>
> I'm trying to perform adonis and betadisper functions on a relative large
> data set (7000 sites x 50 spp.) and found that R cannot allocate that much
> memory to this task (I got an error along the line of "Cannot allocate
> vector of xx Mb"). I have tried to use some of the high performance
> computing packages mentioned on CRAN tasks page (e.g. bigmemory), but it
> seems like functions in Vegan can't really take objects produced from this
> kind of packages. Any suggestions on how to proceed with this type of
> analysis?
>
> Thank you,
> Bier Kraichak
>
>        [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] ctree regression tree, interpretation of mean predicted values in terminal nodes

2012-03-27 Thread Peter Solymos
Kay,

That is an obvious result of the regression tree algorithm which
recursively splits the data and prediction is given as e.g. mean of
observations at terminal nodes. New data will, however, contribute to
cross validation error, a measure of prediction accuracy. The tree
gives the 'global' model, and within each terminal node a 'local'
model can be fit. The simplest 'local' model is the piecewise-constant
model that is the mean. It's like looking for finest homogeneous
stratum where the response is constant. There might be differences in
how a split is chosen, how the tree is pruned, when to stop (rpart,
ctree, etc), but same terminal groping leads to same fitted values.

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635
email <- paste("solymos", "ualberta.ca", sep = "@")
http://www.abmi.ca
http://www.borealbirds.ca
http://sites.google.com/site/psolymos



On Tue, Mar 27, 2012 at 2:42 PM, Kay Cichini  wrote:
> I can't grasp how it can be that the mean prediction at terminal nodes
> perfectly fit the true mean values of the observed variable at the terminal
> nodes -
> I'm afraid I'm missing something completely obviuos here:
>
> # make a regression tree:
> rt <- ctree(Ozone ~ ., data = airq)
>
> # Validate:
> Prediction <- unlist(treeresponse(rt))
> (Val <- data.frame(Node = rt@where,
>                   Prediction, True = airq$Ozone))
>
> # compare mean prediction per node
> # with observed mean values per node:
> options(scipen = 999)
> cbind(aggregate(True ~ Node, FUN = mean, data = Val),
>      Pred = aggregate(Prediction ~ Node, FUN = mean, data = Val)[, 2])
>
> # also, plot predictions vs. true values:
> plot(Val$Prediction, Val$True)
> coef <- coef(lm(Val$Prediction ~ Val$True))
> abline(c(0, coef[1]), c(1, coef[2]))
> myseq <- seq(0, 75, 25)
> abline(v = myseq, h = myseq)
>
>        [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Mantel

2012-04-19 Thread Peter Solymos
Jonathan and Chris,

The mantel function in vegan package contains dist-to-matrix coercion,
so memory requirements for matrices should be setting the limit.

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635
email <- paste("solymos", "ualberta.ca", sep = "@")
http://www.abmi.ca
http://www.borealbirds.ca
http://sites.google.com/site/psolymos



On Thu, Apr 19, 2012 at 7:12 PM, Chris Howden
 wrote:
> I can't comment on vegan but R in general can handle a matrix with about
> 2*10^9 elements (for more R memory info look at
> http://stat.ethz.ch/R-manual/R-devel/library/base/html/Memory-limits.html)
>
> I believe distance matrices usually only store either the lower or upper
> diagonal. So the number of elements in a distance matrix are approx 1/2
> the number of elements in a matrix.
>
> I use the following code to see if my data is too big for R.
>
> ## # CHECK: IS DISTANCE MATRIX TOO BIG FOR MEMORY?
> ## # What is the min memory required for the distance matrix, assuming 1
> bit for each distance (which is actually too small)
> ## # CHECK RESULT GB: requires 8GB which is too big for my computer
> ## dim(segment.input)
> ## (nrow(segment.input)^2)/2
> ## (nrow(segment.input)^2)/(2*10)
>
> ## # CHECK RESULT vector length, keeping in mind that I think distance
> matrices only require half the matrix to
> ## # store all the data, max is 2*10^9: Its bigger
> ## (nrow(segment.input)^2)/2 - 2*10^9
>
>
> ## ## interstingly nrow(segment.input)*nrow(segment.input) won't work for
> large numbers we get
> ## ## > nrows*nrows
> ## ## [1] NA
> ## ## Warning message:
> ## ## In nrows * nrows : NAs produced by integer overflow
>
>
> Chris Howden B.Sc. (Hons) GStat.
> Founding Partner
> Evidence Based Strategic Development, IP Commercialisation and Innovation,
> Data Analysis, Modelling and Training
> (mobile) 0410 689 945
> (fax) +612 4782 9023
> ch...@trickysolutions.com.au
>
>
>
>
> Disclaimer: The information in this email and any attachments to it are
> confidential and may contain legally privileged information. If you are
> not the named or intended recipient, please delete this communication and
> contact us immediately. Please note you are not authorised to copy, use or
> disclose this communication or any attachments without our consent.
> Although this email has been checked by anti-virus software, there is a
> risk that email messages may be corrupted or infected by viruses or other
> interferences. No responsibility is accepted for such interference. Unless
> expressly stated, the views of the writer are not those of the company.
> Tricky Solutions always does our best to provide accurate forecasts and
> analyses based on the data supplied, however it is possible that some
> important predictors were not included in the data sent to us. Information
> provided by us should not be solely relied upon when making decisions and
> clients should use their own judgement.
>
>
> -Original Message-
> From: r-sig-ecology-boun...@r-project.org
> [mailto:r-sig-ecology-boun...@r-project.org] On Behalf Of Jonathan Hughes
> Sent: Friday, 20 April 2012 10:24 AM
> To: r-sig-ecology@r-project.org
> Subject: [R-sig-eco] Mantel
>
>
>
> Dear all,
> Does anyone have an expectation of the maximum distance matrix size that
> vegan can handle during a partial Mantel test?
> thanks,
> Jonathan
>        [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Presence-absence to Occurrence Dataset?

2012-04-20 Thread Peter Solymos
Hi Elyse,

Tom referred to the mefa package, I'd like to draw the attention to
the mefa4 package which is more efficient and can handle large data
since it uses sparse matrices through the Matrix package. The Melt()
function is the inverse operation to xtabs() or its modified version
Xtabs() in mefa4. It can take a matrix, sparse matrix, S3 mefa or S4
Mefa object as its argument:

library(mefa4)

y <- matrix(sample(0:4, 12, replace=TRUE, prob=c(0.5, rep(0.5/4, 4))), 4, 3)
x <- data.frame(x=runif(4), y=runif(4))
colnames(y) <- LETTERS[1:3]
rownames(y) <- rownames(x) <- paste("row", 1:4, sep="_")

z <- Melt(y)
data.frame(z, x[z$rows,])

There is also the (relatively) new multitable package by Steve Walker
with similar functionality. Lots of ways to do it -- the problem must
bother many of us :)

Cheers,

Peter

Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
CW 405, Biological Sciences Bldg
University of Alberta
Edmonton, Alberta, T6G 2E9, Canada
Phone: 780.492.8534
Fax: 780.492.7635
email <- paste("solymos", "ualberta.ca", sep = "@")
http://www.abmi.ca
http://www.borealbirds.ca
http://sites.google.com/site/psolymos



On Fri, Apr 20, 2012 at 1:06 PM,   wrote:
> Elyse--
>
> For the first step of converting your community matrix to triplets of
> {site, species, presence}, you can use melt() from the flexible & powerful
> reshape package as David suggested, or dematrify() from Dave Roberts'
> labdsv package, which was built specifically for this task and has simpler
> syntax than melt.  The second step is to then merge your location dataframe
> with the long-format occurrence data by sitenames.
>
> Do you need coordinates for only presences, or for presences and absences?
> If you need only presences, dematrify() with thresh=0.5 would automatically
> drop absences, while melt() would require a second step LongOccurrences <-
> LongOccurrences[Freq>0,]
>
> So, (untested code) for dataframes named CommMatrix and SiteCoords and
> sitenames as Site:
> LongOccurrences <- dematrify(CommMatrix,thresh=0.5)
> Presences <- merge(LongOccurrences,SiteCoords,by="Site",all.x=TRUE)
>
> If you have more complex needs for handling community and environmental
> data, you might consider the mefa package.
>
> Tom
> ---
> Tom Philippi
> Quantitative Ecologist
> Inventory and Monitoring Program
> National Park Service
> c/o Cabrillo National Monument
> 1800 Cabrillo Memorial Dr
> San Diego, CA 92106
> (619) 523-4576
> tom_phili...@nps.gov
> http://science.nature.nps.gov/im/monitor
> ---
>
>
>
>
>             David Valentim
>             Dias
>                          l.com>                    "r-sig-ecology@r-project.org"
>             Sent by:                  
>             r-sig-ecology-bou                                          cc
>             n...@r-project.or
>             g                                                     Subject
>                                       Re: [R-sig-eco] Presence-absence to
>                                       Occurrence Dataset?
>             04/20/2012 01:32
>             PM AST
>
>
>
>
>
>
>
>
> You can check melt() (from reshape2) and reshape() functions.
>
>
> 2012/4/20 Coffey, Elyse D. (UMSL-Student) 
>
>> Hello everyone,
>>
>> I am working with a large community dataset where column names are
> species
>> and rows are filled with either 1s or 0s (presence absence of the
> species)
>> and each row corresponds to a site name.  I have long and lati
> coordinates
>> for each site in a separate datasheet.
>>
>> I am trying to use a program called Maxent in order model species
>> distributions. The program requires a specific occurrence dataset with
>> columns in this format: species, Long, Lati.
>>
>> It is very difficult and time consuming to create such a dataset by hand!
>> I was wondering if anyone had any ideas as to how I can use R to
> manipulate
>> my original community dataset in order to create the occurrence dataset I
>> described above?
>>
>> Any suggestions would be greatly appreciated!
>>
>> Thanks,
>>
>> Elyse Coffey
>> Biology (B.S.) student
>> University of Missouri-St. Louis
>>
>>
>>        [[alternative HTML version deleted]]
>>
>> ___
>> R-sig-ecology mailing list
>> R-sig-ecology@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>
>
>
>
> --
> Currículo: http://lattes.cnpq.br/7541377569511492
>
>             [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R

Re: [R-sig-eco] Combining two matrices

2012-05-13 Thread Peter Solymos
Thiago,

It's all about indexing:

rn <- union(rownames(matrix1), rownames(matrix2))
cn <- union(colnames(matrix1), colnames(matrix2))
x <- array(0, dim=c(length(rn), length(cn)), dimnames=list(rn, cn))
x[rownames(matrix1), colnames(matrix1)] <- matrix1
x[rownames(matrix2), colnames(matrix2)] <- matrix2
x

Cheers,

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Sun, May 13, 2012 at 8:50 PM, Thiago Gonçalves-Souza
 wrote:
> Dear all,
>
> I have two species-by-site matrices and I'd like to combine them in one
> matrix. The rows (n=20) of each matrix present plants 1-20. So, the new
> matrix should have 40 rows. As these matrices have some shared species and
> some exclusive species, the new matrix should have the species from the two
> matrix, but when certain species that doesn't occur for example in Matrix
> 1, its value will be 0. I've tried the functions merge, cbind and rbind but
> I wasn't able to do it. Please, see the following example to make it clear
> ;-)
>
>
> By using for example the following matrices:
>
> matrix1 <- matrix(c(1,2,3, 11,12,13), nrow = 2, ncol=3, byrow=TRUE,
>               dimnames = list(c("plant1", "plant2"),
>                               c("sp1", "sp2", "sp3")))
>
> matrix2 <- matrix(c(1,2,3, 0,21,9), nrow = 2, ncol=3, byrow=TRUE,
>               dimnames = list(c("plant3", "plant4"),
>                               c("sp1", "sp2", "sp4")))
>
> I'd like to combine them  like this:
>
> new.matrix <- matrix(c(1, 2, 3, 0, 11, 12, 13, 0, 1, 2, 0, 3, 0,21,0,9),
>              nrow = 4, ncol=4, byrow=TRUE,
>               dimnames = list(c("plant1", "plant2", "plant3", "plant4"),
>                               c("sp1", "sp2", "sp3", "sp4")))
>
> Once I have 200 columns and 80 rows, I'm pretty sure that R can help me to
> do it automatically ;-)
>
>
> All the best,
> Thiago.
>
>
> --
> Thiago Gonçalves-Souza, Ms.
> Universidade Estadual Paulista (UNESP)
> Departamento de Zoologia e Botânica
> Programa de Pós-Graduação em Biologia Animal (Doutorado)
> E-mail alternativo: thiagoara...@gmail.com
> Home page:
>
>        [[alternative HTML version deleted]]
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] How to transform a cross-tabulated matrix (as provided by mefa4::Xtab) for further multivariate analyses

2012-06-12 Thread Peter Solymos
Ivailo,

Some (but not all) vegan functions internally coerce the "matrix like"
input object to matrix using 'x <- as.matrix(x)'. The as.matrix()
coercion method is defined for sparse matrices in the Matrix package,
and that is why it works for some (but not all) vegan functions. In
case it does not happen I'd recommend using the 'x <- as.matrix(x)'
coercion prior to supplying the data to a vegan function.

Cheers,

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Tue, Jun 12, 2012 at 8:08 AM, Ivailo  wrote:
> Dear Listmembers,
>
> I have a large data-set that I try to manage (gracefully) according to
> the mefa framework, but have noticed that cross-tabulated matrices
> obtained using mefa4::Xtab can be directly submitted to some of the
> multivariate functions in vegan but not to others.
>
> Here is an example:
>
> # example data
> #
> x <- data.frame(
>  sample = paste("Sample", c(1,1,2,2,3,4), sep="."),
>  species = c(paste("Species", c(1,1,1,2,3), sep="."),  "zero.pseudo"),
>  count = c(1,2,10,3,4,0))
>
> # create a cross-tabulated matrix; drop empty columns and rows to be
> able to dca()
> #
> x1 <- Xtab(count ~ sample + species, x, cdrop=T, rdrop=T)
>
> # now perform some analyses
> #
> dca <- decorana(x1)
> # works OK
>
> vegemite(x1, dca, "Hill", zero="-")
> ##Error in colSums(x[site.ind, ]) :
> ##  'x' must be an array of at least two dimensions
>
> # Trying to make 'x1' a data-frame does not work either
> #
> as.data.frame(x1)
> ##Error in as.data.frame.default(x1) :
> ## cannot coerce class 'structure("dgCMatrix", package = "Matrix")'
> into a data.frame
>
> What would be the correct/recommended way of transforming such
> matrices for use in subsequent multivariate analyses?
>
> Any hints would be much appreciated.
>
> Cheers,
> Ivailo
> --
> UBUNTU: a person is a person through other persons.
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Carrying capacity in R

2012-06-28 Thread Peter Solymos
Manuel,

As a starter you can fit nonlinear models using growth functions, and
calculate carrying capacity from estimated model parameters
(stats:::nle, lme4:::nlmer).

I am not sure if this is of big help as it is still in development,
but our PVAClone R package is almost ready for its 1st CRAN submission
and can be found at the developmental R-Forge site:
https://r-forge.r-project.org/projects/dcr/

The PVAClone package offers likelihood based population viability
analysis in the presence of observation error and missing data.
Currently it can be used to fit and compare support for models based
on Gompertz and Ricker growth models with no observation error, Normal
or Poisson observation error as described in these papers (other
growth models and forecasting is soon to be added):

Ponciano, J. M. et al. 2009. Hierarchical models in ecology:
confidence intervals, hypothesis testing, and model selection using
data cloning. Ecology 90, 356-362.

Nadeem, K., Lele S. R., 2012.  Likelihood based population viability
analysis in the presence of observation error. Oikos. doi:
10./j.1600-0706.2011.20010.x

Cheers,

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


2012/6/28 Manuel Spínola :
> Dear list members,
>
> Is there any package (and function) to assess carrying capacity for
> biological population?
>
> Best,
>
> Manuel
>
> --
> *Manuel Spínola, Ph.D.*
> Instituto Internacional en Conservación y Manejo de Vida Silvestre
> Universidad Nacional
> Apartado 1350-3000
> Heredia
> COSTA RICA
> mspin...@una.ac.cr
> mspinol...@gmail.com
> Teléfono: (506) 2277-3598
> Fax: (506) 2237-7036
> Personal website: Lobito de río 
> Institutional website: ICOMVIS 
>
>        [[alternative HTML version deleted]]
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Re-arrange data frame

2012-06-29 Thread Peter Solymos
Manuel,

You haven't specified the general problem, but for this particular
situation this is how you can do it:

x <- data.frame(array(1:12, c(3,4), list(paste("item", 1:3),
paste("col", 1:4
x <- data.frame(Item=rownames(x), x)
y <- data.frame(Item=x$Item[rep(1:3, each=2)],
matrix(as.matrix(x[,-1]), 6, 2, byrow=TRUE))

Cheers,

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


2012/6/29 Manuel Spínola :
> Dear List members,
>
> How can I re-arrange a data frame as detailed below.
>
> old data frame
>
> Item      col1  col2 col3 col4
>
> item1      12    11    6      7
> item2      10    8     5      4
> item3      3      5     4      3
>
> New data frame
>
> item1      12   11
> item1       6     7
> item2      10    8
> item2       5     4
> item3      3      5
> item3      4      3
>
> Best,
>
> Manuel
>
>
> --
> *Manuel Spínola, Ph.D.*
> Instituto Internacional en Conservación y Manejo de Vida Silvestre
> Universidad Nacional
> Apartado 1350-3000
> Heredia
> COSTA RICA
> mspin...@una.ac.cr
> mspinol...@gmail.com
> Teléfono: (506) 2277-3598
> Fax: (506) 2237-7036
> Personal website: Lobito de río 
> Institutional website: ICOMVIS 
>
>        [[alternative HTML version deleted]]
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Randomizing matrices

2012-08-18 Thread Peter Solymos
Allan,

Simply defining the dimension might work:
dim(v) <- dim(trial2)
but it is not clear what you are trying to achieve with the rep(...,
1000) part. It won't permute the matrix 1000 times but repeat same
values. You might want to have a look at oecosimu in vegan which
calculates the distribution given you SSD statistics, and you can use
many different matrix permutation algorithms although it seems to me
that you are just shuffling the cells.

Cheers,

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Sat, Aug 18, 2012 at 7:05 AM, Allan Edelsparre  wrote:
> Dear R ecologists,
>
> I'm trying to figure out a way to calculate the sum of squared
> distances (SSD) between two matrices, where one matrix is held
> constant and the other is randomized. So far I have been able to get
> the syntax together to obtain my observed SSD, but the problem for me
> is obtain SSD from 1000 randomizations to obtain a distribution. Below
> is my syntax, all data were standardized a priori, each matrix is
> 6columns x 363rows.
>
> For now the issue seems to be that the 'sample' command destroys the
> matrix and turns it into a vector. I've tried to remedy this but now
> it seems I may have one huge matrix from my randomizations rather than
> 1000 of the original size.
>
> data1<-read.table("StandDataTime1and2.txt", header=TRUE)
> data1<-data.frame(data1)
>
> #split data into two matrices 6 X 363
> #trial1
> trial1<-cbind(data1$fe1, data1$fr1, data1$ar1, data1$mid1, data$bot1,
> data$surf1)
> trial1<-as.matrix(trial1)
>
> ###trial2
> trial2<-cbind(data1$fe2, data1$fr2, data1$ar2, data1$mid2, data1$bot2,
> standardizedData$surf2)
> trial2<-as.matrix(trial2)
>
> ##calculate euclidean distance between matrices
> library(fields)
> distances<-rdist(trial1, trial2)
>
> ###getting the observed value SSD
> squared<-distances %*% distances
> ssd<-sum(squared)
> ssd
> ###observed value is 14301499
>
> ###
> Below is where the problem is occuring
>
>
> permuting one block (trial2) and calculating a 95%
> distribution from 1000 randomizations
> v <- rep(sample(trial2 <- as.matrix(trial2)),1000)
> ## Now run those thousand values through the equation to calculate the ssd2
>
> for (i in v) {
>   dist2 <-rdist(trial1,[i])
>   squared2<-dist2^2
>   ssd2<-sum(squared2)
>   print(ssd2)
> }
>
> Working with matrices seems to add to the level of difficulty. Any
> help here would be greatly appreciated. Thanks in advance for any
> help.
>
> Allan
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Randomizing matrices

2012-08-19 Thread Peter Solymos
Allan, here is a toy example. Cc-ing back to SIG-ECO:

library(fields)
library(vegan)
x <- matrix(1:12,3,4)
rf <- function(x)
array(sample(x), dim=dim(x))
sf <- function(x, y) {
d <- rdist(x, y)
sum(d %*% d)
}
z <- oecosimu(x, sf, rf)
z
hist(c(z$statistic, z$oecosimu$simulated))
abline(v=z$statistic, col=2)

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Sun, Aug 19, 2012 at 5:50 AM, Allan Edelsparre  wrote:
> Hi Peter,
>
> I am still having trouble with matrices, especially the oecosimus, where I 
> got completely lost. However, as you also pointed out, all I am trying to do 
> is to shuffle the cells. Do you have any suggestions as to how I can shuffle 
> the cells a 1000 times, and then go on to make the SSD value 1000 times?
>
> Thanks again for helping out.
>
> Allan
>
>
>
> - Original Message -
> From: "Peter Solymos" 
> To: "Allan Edelsparre" 
> Cc: r-sig-ecology@r-project.org
> Sent: Saturday, August 18, 2012 10:50:46 AM
> Subject: Re: [R-sig-eco] Randomizing matrices
>
> Allan,
>
> Simply defining the dimension might work:
> dim(v) <- dim(trial2)
> but it is not clear what you are trying to achieve with the rep(...,
> 1000) part. It won't permute the matrix 1000 times but repeat same
> values. You might want to have a look at oecosimu in vegan which
> calculates the distribution given you SSD statistics, and you can use
> many different matrix permutation algorithms although it seems to me
> that you are just shuffling the cells.
>
> Cheers,
>
> Peter
>
> --
> Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
> soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
> Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
> Boreal Avian Modelling Project, http://www.borealbirds.ca
>
>
> On Sat, Aug 18, 2012 at 7:05 AM, Allan Edelsparre  
> wrote:
>> Dear R ecologists,
>>
>> I'm trying to figure out a way to calculate the sum of squared
>> distances (SSD) between two matrices, where one matrix is held
>> constant and the other is randomized. So far I have been able to get
>> the syntax together to obtain my observed SSD, but the problem for me
>> is obtain SSD from 1000 randomizations to obtain a distribution. Below
>> is my syntax, all data were standardized a priori, each matrix is
>> 6columns x 363rows.
>>
>> For now the issue seems to be that the 'sample' command destroys the
>> matrix and turns it into a vector. I've tried to remedy this but now
>> it seems I may have one huge matrix from my randomizations rather than
>> 1000 of the original size.
>>
>> data1<-read.table("StandDataTime1and2.txt", header=TRUE)
>> data1<-data.frame(data1)
>>
>> #split data into two matrices 6 X 363
>> #trial1
>> trial1<-cbind(data1$fe1, data1$fr1, data1$ar1, data1$mid1, data$bot1,
>> data$surf1)
>> trial1<-as.matrix(trial1)
>>
>> ###trial2
>> trial2<-cbind(data1$fe2, data1$fr2, data1$ar2, data1$mid2, data1$bot2,
>> standardizedData$surf2)
>> trial2<-as.matrix(trial2)
>>
>> ##calculate euclidean distance between matrices
>> library(fields)
>> distances<-rdist(trial1, trial2)
>>
>> ###getting the observed value SSD
>> squared<-distances %*% distances
>> ssd<-sum(squared)
>> ssd
>> ###observed value is 14301499
>>
>> ###
>> Below is where the problem is occuring
>>
>>
>> permuting one block (trial2) and calculating a 95%
>> distribution from 1000 randomizations
>> v <- rep(sample(trial2 <- as.matrix(trial2)),1000)
>> ## Now run those thousand values through the equation to calculate the ssd2
>>
>> for (i in v) {
>>   dist2 <-rdist(trial1,[i])
>>   squared2<-dist2^2
>>   ssd2<-sum(squared2)
>>   print(ssd2)
>> }
>>
>> Working with matrices seems to add to the level of difficulty. Any
>> help here would be greatly appreciated. Thanks in advance for any
>> help.
>>
>> Allan
>>
>> ___
>> R-sig-ecology mailing list
>> R-sig-ecology@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] error distribution in GLM model

2012-08-19 Thread Peter Solymos
Mario,

If you can assume that the waiting time between events is constant
through time, you can model your counts per unit time with Poisson glm
(constant waiting time leads to an exponential survival function).
log(Observation time) can be used as an offset:

glm(interactions~covariate, offset=log(obs.time), family=poisson)

Note that diagnostic plots indicate that homogeneous Poisson process
assumption might not hold.

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Sun, Aug 19, 2012 at 10:14 AM, Mario Serrajotto
 wrote:
> Dear R-users,
>
> apologies for the total beginner's question. I have some behavioural
> data expressed in number of interactions measured over a certain
> amount of time (expressed in minutes). I have calculated the number of
> interactions per minute and I would like to model this variable as a
> function of a covariate. I am, however, struggling with the choice of
> the argument family. I have thought to log transform the data and
> model them with a GLM by specifying a Gaussian family. Unfortunately
> for me my data do not seem to follow a normal distribution. I then
> recalled that my response is some kind of proportion and should
> perhaps be modelled with binomial GLM. But the data are not bounded
> from 0 to 1...Any suggestions would be greatly appreciated! Please see
> data and R code at the bottom of the e-mail.
>
> Cheers,
>
> Mario
>
> # number of interactions recorded
> interactions<-c(2,1,2,3,0,0,0,0,1,2,0,2,0,0,0,0,0,0,1,0,0,0,0,0,0,0,2,3,0,8,7,6,3,5,4,5,7,6,0,0,0,0,2,2,3,4,1,2,5,6,2,4,0,6,0,0,39,46,45,48,11,8,7,8,6,7,13,10,7,9,32,38,39,25,32,30,11,13)
>
> # observation time (in minutes)
> obs.time<-c(9.47,9.7,9.1,9.45,8.816667,8.93,8.716667,8.85,6,3.85,6.17,6.316667,2,2.38,7.95,9.63,9.816667,1.5,9.416667,9.35,1.28,1.85,1,2.33,6.6,6.316667,4.43,2.5,9.35,9.316667,6.6,6.316667,6.316667,6.17,6,9.516667,9.1,9.45,8.73,2.5,2.38,2,1.85,1.28,2.38,8.65,8.53,8.5,8,7.68,0.55,2.4,8.516667,8.18,7.916667,8,9.58,9.68,9.68,9.48,9.58,9.55,9.48,9.45,9.73,9.68,9.63,9.45,9.48,9.68,9.65,9.616667,9.5,9.43,9.316667,9.33,9.416667,9.47)
>
> # covariate
> covariate<-c(15,15,18,18,40,40,45,45,42,42,50,50,200,200,400,400,500,500,200,200,150,150,90,90,45,45,37,37,42,42,37,37,80,80,110,110,150,150,300,300,250,250,85,85,18,18,42,42,15,15,50,50,45,45,200,200,45,45,15,15,115,115,125,125,500,500,550,550,550,550,45,45,37,37,80,80,100,100)
>
> # calculate total number of interaction per minute
> interactions.m<-interactions/obs.time
>
> # fit glm
> glm(log(interactions.m+1)~covariate)
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] subsetting in R

2012-09-18 Thread Peter Solymos
Kristen,

Try something like this:

i <- sample(1:nrow(DataSet.Sub2), 85, replace=FALSE)
DataSet.Sub2.66<- DataSet.Sub2[i,]
DataSet.Sub2.33<- DataSet.Sub2[-i,]

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Tue, Sep 18, 2012 at 3:56 PM, Kristen Gorman  wrote:
> Dear all,
> I have a dataset where I would like to take 2/3 of the data to build a model 
> and test it on the remaining 1/3 of the dataset. In order to generate the 2/3 
> dataset I have the following code:
>
> # 85 is .66 of 129 in order to generate random number for 2/3 dataset.
> DataSet.Sub2.66<- DataSet.Sub2[sample(1:nrow(DataSet.Sub2), 85, 
> replace=FALSE),]
>
> Now, I would like to generate a separate dataset from the remaining 1/3 
> dataset not selected for the 2/3 dataset. Any ideas on how best to do this?
>
>
> Thanks for any input,
>
> Kristen
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Replacing values in a data frame

2012-10-25 Thread Peter Solymos
Maybe:
A$X2[A$X2>1] <- 1

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


2012/10/25 Manuel Spínola :
> Dear list members,
>
> I want to replace the values of a numerical variable in a data frame called
> "A".
>
> A
> X1X2
> 2   1.5
> 31
> 41.2
>
> How to replace any value greater than 1 by the value 1 in the variable X2.
>
> Thank you very much in advance.
>
> Best,
>
> Manuel
>
> --
> *Manuel Spínola, Ph.D.*
> Instituto Internacional en Conservación y Manejo de Vida Silvestre
> Universidad Nacional
> Apartado 1350-3000
> Heredia
> COSTA RICA
> mspin...@una.ac.cr
> mspinol...@gmail.com
> Teléfono: (506) 2277-3598
> Fax: (506) 2237-7036
> Personal website: Lobito de río 
> Institutional website: ICOMVIS 
>
> [[alternative HTML version deleted]]
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] About additive partitioning - adipart

2012-10-29 Thread Peter Solymos
Thiago,

Additive diversity partitions are calculated as difference between
expected diversities at subsequent levels of a given sampling
hierarchy. If levels are not nested, it is not clear how to partition
these terms. You need either 1) to come up with a defendable method of
how to additively partition diversity among crossed hierarchies, or 2)
compare diversity partitions among plant species (e.g. using it as
highest level of the hierarchy, and define all subsequent levels as
the interaction between plant species and geographically nested
levels).

Cheers,

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Mon, Oct 29, 2012 at 10:41 AM, Thiago Gonçalves-Souza
 wrote:
> Dear all,
>
> I'm interested in analyze alpha and beta diversity components in an
> hierarchical way using adipart function. However, my design isn't nested. I
> collected spiders on five plant species (5 different "architecture"),
> plants on different patches , patches on 12 different sites. I sampled 100
> individual plants within each locality, 20 individuals from each of the
> five plant species. In addition, I sampled 20 patches within each locality.
> So, my table is similar with this one:
>
>plant architecture patch site  1 1 1 1  2 1 1 1  3 1 1 1  4 1 2 1  5 2 2
> 1  6 2 3 1  7 2 3 1  8 2 4 1  9 3 4 1  10 3 5 1  11 3 6 1  12 3 6 1  ... ...
> ... 2  ... ... ... 2  ... ... ... 2
>
> I've run the adipart function and the following warning message appear:
>
>
> Warning message:
> In adipart.formula(species ~ ., hierar, index = "richness", nsimul = 1000) :
>   levels are not perfectly nested
>
> Is there any problem with Additive Diversity Partitioning method when the
> design isn't nested?
>
> Thank you all in advance,
> Thiago.
>
> --
> Thiago Gonçalves-Souza, Ms.
> Universidade Estadual Paulista (UNESP)
> Departamento de Zoologia e Botânica
> Programa de Pós-Graduação em Biologia Animal (Doutorado)
> E-mail alternativo: thiagoara...@gmail.com
> Home page:
>
> [[alternative HTML version deleted]]
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] looping and plotting a Lefkovitch/Leslie projection

2012-11-29 Thread Peter Solymos
Jeffrey,
Check out also the popbio package.
Cheers,
Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Thu, Nov 29, 2012 at 12:01 PM, Jeffrey Stratford
 wrote:
> These are great examples.  Thanks so much guys!
>
>
> On Thu, Nov 29, 2012 at 1:41 PM, Marc Taylor  wrote:
>
>> Hello Jeffrey.
>>
>> I have done something similar for a course on "Modelling in Conservation
>> Biology". Here is a simple example of an age-based Leslie model with three
>> classes:
>>
>> L=matrix(c(0.5, 0., 0, 1.0, 0, 0., 0.75, 0, 0), nrow=3, ncol=3)
>> time=10
>> No=c(2,0,0)
>>
>> POP=c(No)
>> Nt=No
>> for (i in 1:time){
>> Nt = drop(L %*% Nt)
>> POP=rbind(POP,Nt)
>> }
>>
>> plot(0:time,POP[,1], type="l", ylim=range(POP), ylab="numbers",
>> xlab="time")
>> for (i in 2:length(POP[1,])){
>> lines(0:time,POP[,i],col=i)
>> }
>>
>> legend(0.5*time,max(POP),
>> c(paste("class",(1:length(No,
>> lwd=1,
>> col=c(1:length(No)))
>>
>> ###Eigenvalues (info on pop trajectory and distribution)
>> sum(POP[(time),])/sum(POP[(time-1),]) # the rate of population change
>> between the last two iterations
>> x=eigen(L)
>> x
>>
>> #dominant eigen value
>> abs(x$values)
>>
>> #The stable population distribution is specified by the 1st vector. The
>> proportion of each element of the vector sum equals the corresponding class
>> proportions.
>> abs(x$vectors[,1]/sum(x$vectors[,1]))
>>
>>
>>
>>
>>
>> Another nice example from a real study is that of Crouse et al. (1987).
>> the paper comes with all the information needed to reproduce a 7 stage
>> Usher matrix model (i.e. individuals have the possibility to stay in the
>> same class and not advance).
>>
>> Crouse, Deborah T., Larry B. Crowder, and Hal Caswell. "A stage-based
>> population model for loggerhead sea turtles and implications for
>> conservation." *Ecology* 68.5 (1987): 1412-1423.
>>
>>
>>  Cheers,
>> Marc
>>
>>
>> On Thu, Nov 29, 2012 at 5:36 PM, Alan Haynes  wrote:
>>
>>> Is this what you want?
>>>
>>> Your code
>>> #the data
>>> A <- c(0,0,100, 0.4,0,0, 0,0.7,0)
>>> #create a matrix from the data
>>> L <- matrix(A, nrow=3, byrow=T)
>>> #check out the matrix
>>> L
>>> #the starting  population
>>> n_0 <- c(1000, 400, 100)
>>> #the next generation
>>> #n_1 <- L %*% n_0
>>> #check it out
>>> #n_1
>>> #run another generation
>>> #n_2 <- L %*% n_1
>>> #check it out
>>> #n_2
>>>
>>> my code:
>>>
>>> pops <- list()
>>> pops[[1]] <- n_0
>>>
>>> for(i in 2: 20){ # gives 19 generations
>>>   pops[[i]] <- L %*% pops[[i-1]] # assign your matrix mult to a list slot
>>> }
>>> pops[[20]] # the 19th generation (because the first slot is the 0
>>> generation)
>>>
>>> HTH
>>>
>>> Alan
>>>
>>> --
>>> Email: aghay...@gmail.com
>>> Mobile: +41794385586
>>> Skype: aghaynes
>>>
>>>
>>> On 29 November 2012 17:07, Jeffrey Stratford
>>> wrote:
>>>
>>> > Hi everyone,
>>> >
>>> > I have a two part population lab in my Ecology course. The first part
>>> was
>>> > showing exponential and logistic growth. For the second section, I would
>>> > like to introduce them to Leslie/Lefkovitch models. I'm able to multiply
>>> > the matrix and the age classes step by step but I would like to have
>>> them
>>> > run projections for as many generations as they choose and show the
>>> > population structure by plotting the number in each age class over time.
>>> >
>>> > I know software programs like Populus do this but I would like to have
>>> them
>>> > (and myself) learn the programming.
>>> >
>>> > Here's what I have so far:
>>> >
>>> > #the data
>>> > A <- c(0,0,100, 0.4,0,0, 0,0.7,0)
>>> > #create a matrix from the data
>>> > L <- matrix(A, nrow=3, byrow=T)
>>> > #check out the matrix
>>> > L
>>> > #the starting  population
>>> > n_0 <- c(1000, 400, 100)
>>> > #the next generation
>>> > n_1 <- L %*% n_0
>>> > #check it out
>>> > n_1
>>> > #run another generation
>>> > n_2 <- L %*% n_1
>>> > #check it out
>>> > n_2
>>> >
>>> >
>>> > Many thanks,
>>> >
>>> > Jeff
>>> > --
>>> > 
>>> > Jeffrey A. Stratford, PhD
>>> > Department of Biology and Health Sciences
>>> > Wilkes University, PA 18766 USA
>>> > 570-332-2942
>>> > http://web.wilkes.edu/jeffrey.stratford/
>>> > 
>>> >
>>> > [[alternative HTML version deleted]]
>>> >
>>> > ___
>>> > R-sig-ecology mailing list
>>> > R-sig-ecology@r-project.org
>>> > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>> >
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> ___
>>> R-sig-ecology mailing list
>>> R-sig-ecology@r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>>
>>
>>
>

Re: [R-sig-eco] removing singleton taxa

2012-12-17 Thread Peter Solymos
Kate,

To get what you want the simplest way try:

df1[,colSums(df1>0)>1,drop=FALSE]

Note the drop=FALSE argument, which makes sure that you still get a
matrix if only one species occurs in more than one sites (i.e. no
surprising vector results in the end).

Cheers,

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Mon, Dec 17, 2012 at 5:43 PM, Chris Howden
 wrote:
> I'm not sure if I've got the syntax quite right ((m about to rush out the
> door), but how about something like.
>
> check <- colSums(df1>0)  ~ gives U a vector of the number of sites with
> abundance > 0 for each species
> check2 <- [check>1]  ~ gives U a logical vector which is FALSE  if
> a species occurs in 0 or 1 site, TRUE otherwise
>
> df2 <- df1[,check2]~ subset those columns that are TRUE i.e.
> have a species in more than 1 site.
>
>
>
> Chris Howden B.Sc. (Hons) GStat.
> Founding Partner
> Evidence Based Strategic Development, IP Commercialisation and Innovation,
> Data Analysis, Modelling and Training
> (mobile) 0410 689 945
> (fax) +612 4782 9023
> ch...@trickysolutions.com.au
>
>
>
>
> Disclaimer: The information in this email and any attachments to it are
> confidential and may contain legally privileged information. If you are
> not the named or intended recipient, please delete this communication and
> contact us immediately. Please note you are not authorised to copy, use or
> disclose this communication or any attachments without our consent.
> Although this email has been checked by anti-virus software, there is a
> risk that email messages may be corrupted or infected by viruses or other
> interferences. No responsibility is accepted for such interference. Unless
> expressly stated, the views of the writer are not those of the company.
> Tricky Solutions always does our best to provide accurate forecasts and
> analyses based on the data supplied, however it is possible that some
> important predictors were not included in the data sent to us. Information
> provided by us should not be solely relied upon when making decisions and
> clients should use their own judgement.
>
>
> -Original Message-
> From: r-sig-ecology-boun...@r-project.org
> [mailto:r-sig-ecology-boun...@r-project.org] On Behalf Of Kate Boersma
> Sent: Tuesday, 18 December 2012 11:23 AM
> To: r-sig-ecology@r-project.org
> Subject: [R-sig-eco] removing singleton taxa
>
> Hi list. I am new to R and stuck on a very simple problem. Forgive me if
> this is not the right forum for my question - feel free to refer me
> elsewhere.
>
> I have a community matrix of sites*species, and I want to remove species
> that only occur in a single site (singleton taxa). It sounds so simple,
> but I have spent hours (no joke) with google and my reference books and am
> still stuck.
>
> Here's some example code:
> Species1 = sample(0:20, 10, replace=TRUE)
> Species2 = sample(0:20, 10, replace=TRUE)
> Species3 = c(0,0,0,0,0,0,0,4,0,0)
> df1 = data.frame(Species1, Species2, Species3)
> df1
>
> I want to remove Species 3. Basically, I want R to count the number of
> non-zero cells in a column and remove the column if that number is 1 or 0.
>
> I have tried messing around with rowsums(), length(), table() and
> specnumber() in the vegan package and can't figure it out.
>
> Thanks,
> Kate
>
> --
> Kate Boersma
> PhD Candidate
> Oregon State University
> Department of Zoology
> Cordley 3029
> Corvallis, OR 97331
> kate.boer...@science.oregonstate.edu
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Standardizing data

2013-01-23 Thread Peter Solymos
Bruce,

Standardizing might not be the best way to go if you have low counts.
You can possibly assume that events follow a homogeneous Poisson
process and rate varies with night length (linear or quadratic) [Y|x ~
Poisson(phi); log(phi)=f(x)]. You can estimate corresponding
coefficients by glm(). I think controlling for night length
differences work even if you throw in other covariates as additive
effects on the log scale.

You can get sunrise /sunset times from maptools::sunriset, be careful
with timezones though.

Cheers,

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Wed, Jan 23, 2013 at 7:16 PM, Bruce Miller  wrote:
> Hi all,
>
> Posting this query again as no one replied.
>
> I need to create a way to standardize bat sampling data in northern
> latitudes not only by the Activity Index standardized by sample time
> (unit effort) but by the constantly changing night length.
>
> This was easy in the tropics with a straight forward application of my
> Acoustic Activity Index (AI) [Miller, B. W. 2001. A method for
> determining relative activity of free flying bats using a new activity
> index for acoustic monitoring. Acta Chiropterologica. 3: 93-105].
>
> This uses the occurrence of each species by 1-minute time blocks and is
> then standardized by "Unit Effort"  which is the total sample time each
> night in hours. With a night length that was fairly standard with only a
> small ą of 1 hr or so for seasonal variation and very small
> night-to-night change this seem OK.
>
> However looking at data from here in the northern latitudes,I need
> something else in addition.   The night length not only changes rapidly
> night-to-night during the summer, but has a very wide ą.  So using only
> AI -occurrences per each 1-minute time interval standardized by sampling
> time previously used in the tropics may not reflect a realistic
> comparison measure up here in the north.
>
> So my question is how to incorporate the length of each night into the
> AI standardized by sample time?
> How best to integrate the night length since that is one of the key
> factors constraining bat activity?
>
> */R/* seems awesome for running repetitive calculations once one has the
> code line script.  So I am trying to see how to develop such a new
> standardized /R/ code for two data sets, one a DF with the simple AI
> values that have already been standardized by unit effort and another
> that includes the ever changing sunset-sunrise data.
>
> For working out GGplot temporal activity plots by minute for each night
> it took bit of hand holding by Hadley as R does not (or at lead did not)
> "do time well".  Not sure what can be done with standardizing the data
> with changing night lengths form one night to the.   The crossover
> midnight may not be an issue for this calculation since we just need a
> total night length in decimal hours.
>
> Moon phase and % illumination is yet another issue, but not relevant at
> the moment.:-)
>
> Any and all suggestions welcome and the bats will depend on it.
>
> Bruce
>
> --
> Bruce W. Miller, Ph.D.
> Conservation Ecologist
> Neotropical Bat Projects
>
>
> office details
> 11384 Alpine Road
> Stanwood, Mi. 49346
> Phone (231) 679-6059
>
>
>
> [[alternative HTML version deleted]]
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Rao entropy with presence-absence data

2013-06-12 Thread Peter Solymos
Thomas,

1) Presence absence data means that you have cell probabilities 1/S_i for
detections and 0 for missing species in a given community i. As Zoltán also
pointed out, it is meaningful to use this, as it has the interpretation of
choosing different species from a species list (and not from a pool of
individuals).

2) Calculations never end. You might have a bad luck with your data. The
underlying algorithm (see refs on ade4:::divcmax help page, I am referring
to the underlying functions and not the wrapper from FD) is not very robust
and it often fails to find a solution for the max Rao's quadratic entropy
value. It is legitimate to use this with 0/1 data, and it hangs up even
with count based probabilities quite often according to my experience. I
haven't spend much time figuring out a fix, but there are 2 loops in the
code without a break. Below is a function where you can specify maxit
argument so that it quits before the end of times.

I am cc-in to Simon Penel, maintainer of ade4, as this seems to me as a
possible feature they might want to add.

Cheers,

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca

## -- code starts here

divcmax_2 <-
function (dis, epsilon = 1e-08, comment = FALSE, maxit = 1000)
{
if (!inherits(dis, "dist"))
stop("Distance matrix expected")
if (epsilon <= 0)
stop("epsilon must be positive")
if (!is.euclid(dis))
stop("Euclidean property is expected for dis")
D2 <- as.matrix(dis)^2/2
n <- dim(D2)[1]
result <- data.frame(matrix(0, n, 4))
names(result) <- c("sim", "pro", "met", "num")
relax <- 0
x0 <- apply(D2, 1, sum)/sum(D2)
result$sim <- x0
objective0 <- t(x0) %*% D2 %*% x0
if (comment == TRUE)
print("evolution of the objective function:")
xk <- x0

COUNTER <- 0

repeat {

if (COUNTER > maxit)
stop("maxit reached")

repeat {

if (COUNTER > maxit)
stop("maxit reached")

maxi.temp <- t(xk) %*% D2 %*% xk
if (comment == TRUE)
print(as.character(maxi.temp))
deltaf <- (-2 * D2 %*% xk)
sature <- (abs(xk) < epsilon)
if (relax != 0) {
sature[relax] <- FALSE
relax <- 0
}
yk <- (-deltaf)
yk[sature] <- 0
yk[!(sature)] <- yk[!(sature)] - mean(yk[!(sature)])
if (max(abs(yk)) < epsilon) {
break
}
alpha.max <- as.vector(min(-xk[yk < 0]/yk[yk < 0]))
alpha.opt <- as.vector(-(t(xk) %*% D2 %*% yk)/(t(yk) %*%
D2 %*% yk))
if ((alpha.opt > alpha.max) | (alpha.opt < 0)) {
alpha <- alpha.max
} else {
alpha <- alpha.opt
}
if (abs(maxi.temp - t(xk + alpha * yk) %*% D2 %*%
(xk + alpha * yk)) < epsilon) {
break
}
xk <- xk + alpha * yk

COUNTER <- COUNTER + 1

}
if (prod(!sature) == 1) {
if (comment == TRUE)
print("KT")
break
}
vectD2 <- D2 %*% xk
u <- 2 * (mean(vectD2[!sature]) - vectD2[sature])
if (min(u) >= 0) {
if (comment == TRUE)
print("KT")
break
}
else {
if (comment == TRUE)
print("relaxation")
satu <- (1:n)[sature]
relax <- satu[u == min(u)]
relax <- relax[1]
}

COUNTER <- COUNTER + 1

}
if (comment == TRUE)
print(list(objective.init = objective0, objective.final =
maxi.temp))
result$num <- as.vector(xk, mode = "numeric")
result$num[result$num < epsilon] <- 0
xk <- x0/sqrt(sum(x0 * x0))

COUNTER <- 0

repeat {

if (COUNTER > maxit)
stop("maxit reached")

yk <- D2 %*% xk
yk <- yk/sqrt(sum(yk * yk))
if (max(xk - yk) > epsilon) {
xk <- yk
}
else break

COUNTER <- COUNTER + 1

}
x0 <- as.vector(yk, mode = "numeric")
result$pro <- x0/sum(x0)
result$met <- x0 * x0
restot <- list()
restot$value <- divc(cbind.data.frame(result$num), dis)[,
1]
restot$vectors <- result
return(restot)
}

## -- code ends here


On Wed, Jun 12, 2013 at 8:13 AM, Zoltan Botta-Dukat <
botta-dukat.zol...@okologia.mta.hu> wrote:

> Dear Thomas,
>
>
>
>
>> Is it possible, and meaningful, to calculate  Rao`s quadratic entropy
>> index (function dbFD from package FD) with presence-absence data?
>>
> Yes, it is possible and meaningful. In this case it is simply the mean
> distance between the occurring species.

Re: [R-sig-eco] Rao entropy with presence-absence data

2013-06-13 Thread Peter Solymos
You should be able to calculate any distance matrix from binary trait data
using stats:::dist, vegan:::vegdist etc. Once you have a valid dist object, you
can use ade4:::divc instead of the wrapper.

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Thu, Jun 13, 2013 at 2:11 AM, Thomas Möckel
wrote:

> Hello,
> Thank you very much for you answers. I was wondering what happens if I
> have abundance data for my species matrix but binary trait values?
> Because it seems the function has some problems with this combination.
>
> Regards,
>  Thomas
>
> On 6/12/2013 5:17 PM, Peter Solymos wrote:
> > Thomas,
> >
> > 1) Presence absence data means that you have cell probabilities 1/S_i for
> > detections and 0 for missing species in a given community i. As Zoltán
> also
> > pointed out, it is meaningful to use this, as it has the interpretation
> of
> > choosing different species from a species list (and not from a pool of
> > individuals).
> >
> > 2) Calculations never end. You might have a bad luck with your data. The
> > underlying algorithm (see refs on ade4:::divcmax help page, I am
> referring
> > to the underlying functions and not the wrapper from FD) is not very
> robust
> > and it often fails to find a solution for the max Rao's quadratic entropy
> > value. It is legitimate to use this with 0/1 data, and it hangs up even
> > with count based probabilities quite often according to my experience. I
> > haven't spend much time figuring out a fix, but there are 2 loops in the
> > code without a break. Below is a function where you can specify maxit
> > argument so that it quits before the end of times.
> >
> > I am cc-in to Simon Penel, maintainer of ade4, as this seems to me as a
> > possible feature they might want to add.
> >
> > Cheers,
> >
> > Peter
> >
> > --
> > Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
> > soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
> > Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
> > Boreal Avian Modelling Project, http://www.borealbirds.ca
> >
> > ## -- code starts here
> >
> > divcmax_2 <-
> > function (dis, epsilon = 1e-08, comment = FALSE, maxit = 1000)
> > {
> >  if (!inherits(dis, "dist"))
> >  stop("Distance matrix expected")
> >  if (epsilon <= 0)
> >  stop("epsilon must be positive")
> >  if (!is.euclid(dis))
> >  stop("Euclidean property is expected for dis")
> >  D2 <- as.matrix(dis)^2/2
> >  n <- dim(D2)[1]
> >  result <- data.frame(matrix(0, n, 4))
> >  names(result) <- c("sim", "pro", "met", "num")
> >  relax <- 0
> >  x0 <- apply(D2, 1, sum)/sum(D2)
> >  result$sim <- x0
> >  objective0 <- t(x0) %*% D2 %*% x0
> >  if (comment == TRUE)
> >  print("evolution of the objective function:")
> >  xk <- x0
> >
> >  COUNTER <- 0
> >
> >  repeat {
> >
> >  if (COUNTER > maxit)
> >  stop("maxit reached")
> >
> >  repeat {
> >
> >  if (COUNTER > maxit)
> >  stop("maxit reached")
> >
> >  maxi.temp <- t(xk) %*% D2 %*% xk
> >  if (comment == TRUE)
> >  print(as.character(maxi.temp))
> >  deltaf <- (-2 * D2 %*% xk)
> >  sature <- (abs(xk) < epsilon)
> >  if (relax != 0) {
> >  sature[relax] <- FALSE
> >  relax <- 0
> >  }
> >  yk <- (-deltaf)
> >  yk[sature] <- 0
> >  yk[!(sature)] <- yk[!(sature)] - mean(yk[!(sature)])
> >  if (max(abs(yk)) < epsilon) {
> >  break
> >  }
> >  alpha.max <- as.vector(min(-xk[yk < 0]/yk[yk < 0]))
> >  alpha.opt <- as.vector(-(t(xk) %*% D2 %*% yk)/(t(yk) %*%
> >  D2 %*% yk))
> >  if ((alpha.opt > alpha.max) | (alpha.opt < 0)) {
> >  alpha <- alpha.max
> >  } e

Re: [R-sig-eco] offset function in glm.nb

2013-06-14 Thread Peter Solymos
Matias,

The offset term is processed as part of parsing the formula, which results
in a vector of length of the response. Using a vector should not be a
problem.

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Fri, Jun 14, 2013 at 4:39 AM, Matias Ledesma wrote:

> Hi,
>
> I have a question about the offset function in glm.nb. As I understod it
> is possible to use rates in the respons variable in glm.nb with the
> function offset.
>
> My model, modelA1F<-glm.nb(Tot.dameggs~STATN+offset(log(FECUND)))
>
> Tot.dameggs= total amount of damaged embryos per individual
> STATN= Station
> FECUND= total number of embryos per individual (huge varability)
>
> My question is if it is possible in this case when FECUND isnt the same
> for all individuals.
>
> Thanks
> /Matias
>
>
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] convergence problems for zero-inflated model

2013-09-23 Thread Peter Solymos
Laura,

Hacking the function is straightforward. Change this line:

hessian <- control$hessian

into

hessian <- FALSE

and then this one:

vc <- -solve(as.matrix(fit$hessian))

as

vc <- diag(1, length(fit$par), length(fit$par))

Then you take care of the unexported model_offset_2 function as
pscl:::model_offset_2 and you are good to go.

*BUT*, is this really what you want? Maybe you want to get around
asymptotics by doing bootstrap.

Cheers,

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Mon, Sep 23, 2013 at 9:29 AM, Lee, Laura  wrote:

> Hi all!
>
> I'm using a zero-inflated model using the function 'zeroinfl' and running
> into convergence problems (system is computationally singular). How can I
> modify the code so it does not calculate a Hessian matrix?
>
> Thanks,
>
> Laura
>
>
>
> E-mail correspondence to and from this address is subject to the North
> Carolina Public Records Law and may be disclosed to third parties unless
> the context is exempt by statute or other regulation.
>
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Diversity on standardised densities in R

2013-10-25 Thread Peter Solymos
Hello,

A parametric model (e.g. Clench) would allow both intrapolation and
extrapolation. There are some caveats of course: (1) these models arose in
the temporal accumulation sense, spatial accumulation is usually calculated
for randomized data, which is an assumption that individuals in the
sampling population are sufficiently mixed; (2) extrapolation from few
individuals won't be highly accurate; (3) this won't solve the N=0
situation, not even N=1.

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Fri, Oct 25, 2013 at 1:10 AM, Vít Syrovátka  wrote:

> Hi,
>
> vegan::rrarefy() is the right way to do this.
>
> However, I can't imagine, how to take 50 random individuals from a sample
> with less then 50 individuals.
> One doesn't know, what species would be collected in case of a larger
> sampling effort, R doesn't know either.
> I would suggest doing this only for samples with more individuals, or
> taking a smaller number of individuals.
>
> Hope that helps,
> Vit
>
> Dne 2013-10-24 21:08, Paris napsal:
>
>> By the way, I think I could use (if Zoop tha name of the dataset)
>> rrarefy(Zoop, sample=50).
>> However, in this case it will not allow me since there are samples with
>> zero
>> individuals.
>> Even if I delete those, there will be some with less than 50 indiv. But I
>> would prefer to standardise effort for 50 individuals. This function, or
>> at
>> least in this form, does not allow that.
>>
>>
>>
>> --
>> View this message in context:
>> http://r-sig-ecology.471788.**n2.nabble.com/Diversity-on-**
>> standardised-densities-in-R-**tp7578474p7578475.html
>> Sent from the r-sig-ecology mailing list archive at Nabble.com.
>>
>> __**_
>> R-sig-ecology mailing list
>> R-sig-ecology@r-project.org
>> https://stat.ethz.ch/mailman/**listinfo/r-sig-ecology
>>
>
> __**_
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/**listinfo/r-sig-ecology
>
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Logistic regression with repeated measures ?

2013-11-27 Thread Peter Solymos
Marie,

Your problem and data seems to me a resource selection problem with matched
use-availability design. Estimating procedure for that design is discussed
in Lele and Keim (2006, Ecology 87:3021--3028) and implemented in the
ResourceSelection package: rspf function, see description of argument 'm'
for specifying matched points for individual birds. The output is a model
for probability of selection given the distribution of environmental
covariates available for these specific individuals.

Cheers,

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Wed, Nov 27, 2013 at 2:29 PM, marieline gentes wrote:

> Dear list,
>
> I am a bit new to logistic regressions. I am working with GPS data from
> GPS-tracked birds. My objective is to investigate whether various
> covariates influence the probabilty of visiting specific habitats. Each
> bird has visited many habitats during the course of its GPS tracking.
>
> Here is a small sample of the data:
>
> Bird.ID Year Sex body.index Recapt PrevWeek.Rain AgriYes AgriNo UrbanYes
> UrbanNo
> CAL 2010 M 21.99155 13-May-10 1.43 0 100 0 100
> CAO 2011 F -19.91797 27-Apr-11 4.23 54 46 9 91
> CFL 2010 F 25.61063 12-May-10 2.16 31 69 2 98
> CFP 2010 M -30.65814 13-May-10 1.43 60 40 0 100
>
> I understand that I have to use logistic regression, with a cbind code,
> because my response variable is not binary anymore (the response is a
> summary of the success vs failures).
>
> Based on R tutorials, I am thinking about codes that would look like this:
>
> Agri.RainSex = glm(cbind(AgriYes, AgriNo) ~ PrevWeekRain + Sex + Year +
> Year*Sex,family=binomial (logit), data=mydata)
> However, contrary to the examples I see online, my data are from
> individual birds, not from groups of birds. If I had been using the raw
> binary data, each bird would have 100 hundred lines (I converted the
> percentages into success/failures)(all my % are weighted the same - that is
> not a problem here). Am I supposed to take into account some kind of
> repeated measure in my model ?
>
> Notes:
> For people who are thinking about overdispersed data: my data does not
> seem to be overdispersed. But I will inspect that after I am confident that
> my basic model is ok. So this question is about dealing with repeated
> measured, not about adding a random intercept for overdispersion.
>
> For people who are working with habitat selection models: this is not the
> case here. We are not working on resource selection. We want to fit a
> simple logistic regression on this data as a part of data exploration. This
> ultimate goal is to link contaminant burden with the proportion of time
> spent in a given habitat.
>
> Thank you for your advice,
>
> Marie
> [[alternative HTML version deleted]]
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Logistic regression with repeated measures ?

2013-11-27 Thread Peter Solymos
Chris,

It is not random effect strictly speaking, but something like that. The
problem is this: RSF models are often constructed as mixed models with
random intercept. But it is known that the intercept is a function of the
other parameters and the available (background) distribution. So a random
intercept is then hard to justify, but still used quite a bit.

The matched design in RSF/RSPF is analogous to the matched case-control
design in logistic regression. So one can specify a global availability
(all pixels in a landscape are equally available for each individual), or
local availability (used points and available points in the vicinity are
matched for each individual separately).

I hope this clarifies the issue. Cheers,

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Wed, Nov 27, 2013 at 4:50 PM, Chris Howden
wrote:

> Hi Peter,
>
> Does it have the ability to fit random effects? Or some other way to
> address the pseudoreplication expected in RSF studies using GPS fix data
> with little time between fixes ? (Just had a quick look at the rspf
> package and I couldn't see any)
>
>
>
> Chris Howden B.Sc. (Hons) GStat.
> Founding Partner
> Evidence Based Strategic Development, IP Commercialisation and Innovation,
> Data Analysis, Modelling and Training
> (mobile) 0410 689 945
> (skype) chris.howden
> ch...@trickysolutions.com.au
>
>
>
>
> Disclaimer: The information in this email and any attachments to it are
> confidential and may contain legally privileged information. If you are
> not the named or intended recipient, please delete this communication and
> contact us immediately. Please note you are not authorised to copy, use or
> disclose this communication or any attachments without our consent.
> Although this email has been checked by anti-virus software, there is a
> risk that email messages may be corrupted or infected by viruses or other
> interferences. No responsibility is accepted for such interference. Unless
> expressly stated, the views of the writer are not those of the company.
> Tricky Solutions always does our best to provide accurate forecasts and
> analyses based on the data supplied, however it is possible that some
> important predictors were not included in the data sent to us. Information
> provided by us should not be solely relied upon when making decisions and
> clients should use their own judgement.
>
>
> -Original Message-
> From: r-sig-ecology-boun...@r-project.org
> [mailto:r-sig-ecology-boun...@r-project.org] On Behalf Of Peter Solymos
> Sent: Thursday, 28 November 2013 10:33 AM
> To: marieline gentes
> Cc: r-sig-ecology@r-project.org
> Subject: Re: [R-sig-eco] Logistic regression with repeated measures ?
>
> Marie,
>
> Your problem and data seems to me a resource selection problem with
> matched use-availability design. Estimating procedure for that design is
> discussed in Lele and Keim (2006, Ecology 87:3021--3028) and implemented
> in the ResourceSelection package: rspf function, see description of
> argument 'm'
> for specifying matched points for individual birds. The output is a model
> for probability of selection given the distribution of environmental
> covariates available for these specific individuals.
>
> Cheers,
>
> Peter
>
> --
> PC)ter SC3lymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
> soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com Alberta
> Biodiversity Monitoring Institute, http://www.abmi.ca Boreal Avian
> Modelling Project, http://www.borealbirds.ca
>
>
> On Wed, Nov 27, 2013 at 2:29 PM, marieline gentes
> wrote:
>
> > Dear list,
> >
> > I am a bit new to logistic regressions. I am working with GPS data
> > from GPS-tracked birds. My objective is to investigate whether various
> > covariates influence the probabilty of visiting specific habitats.
> > Each bird has visited many habitats during the course of its GPS
> tracking.
> >
> > Here is a small sample of the data:
> >
> > Bird.ID Year Sex body.index Recapt PrevWeek.Rain AgriYes AgriNo
> > UrbanYes UrbanNo CAL 2010 M 21.99155 13-May-10 1.43 0 100 0 100 CAO
> > 2011 F -19.91797 27-Apr-11 4.23 54 46 9 91 CFL 2010 F 25.61063
> > 12-May-10 2.16 31 69 2 98 CFP 2010 M -30.65814 13-May-10 1.43 60 40 0
> > 100
> >
> > I understand that I have to use logistic regression, with a cbind
> > code, because my response variable is not binary anymore (the response
> > is a summary of the success vs failures).
> >
> > Bas

Re: [R-sig-eco] Logistic regression with repeated measures ?

2013-11-27 Thread Peter Solymos
So you want like a dose-response relationship using proportions of time as
response if I understand correctly. And you are worried that some
overdispersion might be present. The beta regression deals with continuous
proportions as outcome and you can model overdispersion (even as a function
of covariates) in the betareg package. The only restriction there is that
you can't have 0 or 1 as response (some of the package authors have papers
about zero-and-one-inflated beta regression which does not seem overly
complicated to implement in R).

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Wed, Nov 27, 2013 at 6:08 PM, marieline gentes wrote:

> Hello Chris and Peter,
>
> Thank you very much for your quick answers. But as I pointed out, this
> study really is not a resource selection project (sorry I did not provide
> more background). We are not really interested in knowing why the birds are
> going to a specific habitat rather than an other habitat. The global aim is
> to investigate the effects of habitat use on contaminant profiles, and the
> target audience is ecotoxicologistsThe question I presented here is
> just a very small portion of the analysis : we just want to provide a very
> basic description of whether males and females spend the same proportion of
> their time in each of our habitats, if year has an effect, and a few other
> very basic descriptive trends.
>
> The bulk of the analysis focuses on the links between % contaminant
> (response variable) and habitat use (independant variable), and I will be
> using a logit transform on % contaminants followed by linear models, as
> described in David I. Warton and Francis K. C. Hui 2011. The arcsine is
> asinine: the analysis of proportions in ecology. Ecology 92:3–10.
> http://dx.doi.org/10.1890/10-0340.1.
>
> So back to my original question: Chris - yes, it seems that I won't get
> away without going back to the original binary data if I want to take into
> account the repeated/random effects created by a bird having multiple data
> points.
>
> I am also wondering if it would be terribly wrong to simply use a logit
> transform on the % of time in each habitat ? I would then have only one
> data point for each bird. This mean the data would be treated as a real
> proportion, not as binomial anymore. But since the sampling intensity is
> the same for each bird, i.e., the same denominator (the same total number
> of GPS positions for each bird), maybe the results would be almost
> identical ?
>
> Thank you so much again for your time,
>
> Marie
>
>
>
>
>   On Wednesday, November 27, 2013 7:19:47 PM, Peter Solymos <
> soly...@ualberta.ca> wrote:
>  Chris,
>
> It is not random effect strictly speaking, but something like that. The
> problem is this: RSF models are often constructed as mixed models with
> random intercept. But it is known that the intercept is a function of the
> other parameters and the available (background) distribution. So a random
> intercept is then hard to justify, but still used quite a bit.
>
> The matched design in RSF/RSPF is analogous to the matched case-control
> design in logistic regression. So one can specify a global availability
> (all pixels in a landscape are equally available for each individual), or
> local availability (used points and available points in the vicinity are
> matched for each individual separately).
>
> I hope this clarifies the issue. Cheers,
>
> Peter
>
> --
> Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
> soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
> Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
> Boreal Avian Modelling Project, http://www.borealbirds.ca
>
>
> On Wed, Nov 27, 2013 at 4:50 PM, Chris Howden <
> ch...@trickysolutions.com.au> wrote:
>
> Hi Peter,
>
> Does it have the ability to fit random effects? Or some other way to
> address the pseudoreplication expected in RSF studies using GPS fix data
> with little time between fixes ? (Just had a quick look at the rspf
> package and I couldn't see any)
>
>
>
> Chris Howden B.Sc. (Hons) GStat.
> Founding Partner
> Evidence Based Strategic Development, IP Commercialisation and Innovation,
> Data Analysis, Modelling and Training
> (mobile) 0410 689 945
> (skype) chris.howden
> ch...@trickysolutions.com.au
>
>
>
>
> Disclaimer: The information in this email and any attachments to it are
> confidential and may contain legally privileged information. If you are
> not th

Re: [R-sig-eco] beta regression error

2013-12-03 Thread Peter Solymos
Attila,

See paper and R code by Millar et al. 2011 for a solution based on 'glm':
http://www.esapubs.org/archive///ecol/E092/146/

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Tue, Dec 3, 2013 at 9:23 AM, Attila Lengyel  wrote:

> Dear All,
>
> I have a problem with 'betareg' function of 'betareg' package, would you,
> please, help me? I am trying to model compositional similarity on range (0;
> 1) between pairs of sample units as a function of pairwise geographical
> distances, and I often get an error message like this:
>
> "Error in chol.default(K) :
>   the leading minor of order 3 is not positive definite
> In addition: Warning message:
> In sqrt(wpp) : NaNs produced
> Error in chol.default(K) :
>   the leading minor of order 3 is not positive definite
> In addition: Warning messages:
> 1: In betareg.fit(X, Y, Z, weights, offset, link, link.phi, type, control)
> :
>   failed to invert the information matrix: iteration stopped prematurely
> 2: In sqrt(wpp) : NaNs produced"
>
> The problem occurs with any link function. I read that this error is caused
> when my matrix is not positive definite. But how should I avoid this if
> these are the data I have?
>
> Thank you and best regards,
>
> Attila Lengyel
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Mean distance values (and errors) between groups of samples using vegan

2013-12-16 Thread Peter Solymos
Andrés,

To get statistics other than the mean (SD ~ error as you wrote) you can
stack the dist object (e.g. stack.dist in mefa pkg with dim.names = TRUE)
and then calculate statistics for subsets based on your grouping variable.

Cheers,

Peter

--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca


On Mon, Dec 16, 2013 at 12:23 PM, Jari Oksanen  wrote:

> Function meandist() should do this. Its name is kind of hint.
>
> Cheers, Jari Oksanen
> On 16/12/2013, at 21:16 PM, Andres Mellado Diaz wrote:
>
> > Dear list memnbers,
> >
> > I would greatly appreciate any suggestion on how to get averaged
> distance values (and errors) between groups of samples using vegan
> (vegdist, dist,...).
> >
> > As an example, I have a faunal table (samples by taxa, "mMCINVmmh") and
> some grouping factors (for example: "gradreg_mmh") that I want to test,
> like this:
> >
> >
> >> vegdist(mMCINVmmh, method="bray", binary=FALSE, diag=F,
> upper=F)->vegdist1
> >
> >> vegdist1
> >
> >  TR_1_2TR_1_3TR_2_2TR_2_3TR_3_1TR_3_2
> > TR_1_3 0.7361704
> > TR_2_2 0.1509007 0.7785157
> > TR_2_3 0.6465259 0.4095644 0.6832353
> > TR_3_1 0.8717468 0.9572968 0.8551546 0.9373980
> > TR_3_2 0.6925163 0.7197110 0.7428020 0.4742819 0.7800620
> > TR_3_3 0.8166542 0.7199418 0.8365397 0.6215122 0.8582932 0.5364242
> >
> >> gradreg_mmh
> >
> > [1] 0 0 3 3 5 5 5
> >
> > Levels: 0 3 5
> >
> >>
> >
> > So I want to get the pairwise mean distances and errors between groups
> 0, 3 and 5...
> >
> >
> > many thanks in advance,
> >
> > Andrés
> >
> > --
> > Andrés Mellado Díaz
> >
> > Centre for Hydrological Studies CEH-CEDEX
> > Water Quality Department
> > Pº bajo de la Virgen del Puerto, 3
> > 28005, Madrid
> > SPAIN
> >
> >
> > --
> > Andrés Mellado Díaz
> >
> > Centre for Hydrological Studies CEH-CEDEX
> > Water Quality Department
> > Pº bajo de la Virgen del Puerto, 3
> > 28005, Madrid
> > SPAIN
> >
> > ___
> > R-sig-ecology mailing list
> > R-sig-ecology@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] FW: inconsistent p-values in 'indval'

2014-01-01 Thread Peter Solymos
Eda,

How many permutations do you use? Have you tried setting the RNG seed via
set.seed() ? Also, if you have borderline p-values that change from run to
run it might indicate not so strong discrimination by the given clustering
of sites.

Cheers,

Peter

--
Péter Sólymos
780-492-8534 | soly...@ualberta.ca | http://psolymos.github.com
Alberta Biodiversity Monitoring Institute http://www.abmi.ca
Boreal Avian Modelling Project http://www.borealbirds.ca


On Wed, Jan 1, 2014 at 7:14 PM, Addicott Eda <
eda.addic...@science.dsitia.qld.gov.au> wrote:

> Why do I get inconsistent p-values for indicator values of species when
> I run IndVal (in the labsdv package) on exactly the same dataset? I have
> tested this with a small dataset (40 samples and 55 species) and a
> larger dataset (334 samples and 545 species) and on different numbers of
> clusters ranging from 2 to 8. The 'p' values change enough with
> different test runs to affect the number of significant species in a
> cluster.
>
> thanks
>
> Eda Addicott
> Principal Botanist, Queensland Herbarium,
> Australian Tropical Herbarium
> ph:  07 4232 1807 fax: 07 4232 1842
> Email: eda.addic...@qld.gov.au
>
> Australian Tropical Herbarium
> Sir Robert Norman Building (E2) JCU Cairns Campus
> P.O.Box 6811
> Cairns,  QLD 4870
>
>
> * Disclaimer  **...{{dropped:28}}
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] stochastic population models

2014-02-23 Thread Peter Solymos
Jeff,

I am not sure why you need 100 random numbers for r and K, but if your goal
is to get stochastic state-space model, you need to define the error term
as a separate parameter and run the loop 100 times with the *same* fixed
parameter values. When you do this, then you need to be aware of
parametrization (e.g. Normal error on the log scale is what people usually
use). Here is logistic Ricker state-space model:

K <- 1
r <- 0.3
sigma <- 0.2
T <- 50
N0 <- 10

N <- numeric(T)
N[1] <- N0
for(i in 2:T) {
N[i] <- exp(r * (1 - N[i - 1]/K) + log(N[i - 1]) + rnorm(1, 0, sigma))
}
plot(N, type="b")

Wrapped in a function, you get:

Ricker <- function(r, K, sigma, T, N0) {
N <- numeric(T)
N[1] <- N0
for(i in 2:T) {
N[i] <- exp(r * (1 - N[i - 1]/K) + log(N[i - 1]) + rnorm(1, 0,
sigma))
}
N
}

set.seed(1234)
NN <- replicate(1000, Ricker(r=0.3, K=10^4, sigma=0.2, T=50, N0=10))
matplot(NN, type="l", col=1, lty=1)
lines(rowMeans(NN), col=2, lwd=2)
abline(h=K, col=4, lty=2)

HTH,

Peter


--
Péter Sólymos
780-492-8534 | soly...@ualberta.ca | http://psolymos.github.com
Alberta Biodiversity Monitoring Institute http://www.abmi.ca
Boreal Avian Modelling Project http://www.borealbirds.ca


On Sun, Feb 23, 2014 at 9:44 PM, Stratford, Jeffrey <
jeffrey.stratf...@wilkes.edu> wrote:

> Hi everyone,
>
> I would like to create stochastic population models for an undergraduate
> course.
>
> The goal would be to have students run models, record, results, change
> parameters, and make inferences on changing the effects.
>
> I understand how to draw from distributions, I hit a knowledge wall with
> loops.
>
> Here's what I have for a logistic model to predict N_t (not dN/dt)
>
> K <- rnorm(100, 1, 400) # fluctuating environment
> r <- rnorm(100, 1, 2) # variable reproductive potential
> t <- 50
> N0 <- 10 # start with 10 individuals
>
> for (i in 1:t) { N[,i+1] <- (N0*K)/(N0+((K-N0)^exp(-r*t)))}
> plot(t, Nt)
>
> This doesn't work but I think I'm getting close.
>
> Any help would be much appreciated.
>
> Thanks,
>
> Jeff
>
> --
> 
> Jeffrey A. Stratford, PhD
> Department of Biology and Health Sciences
> Wilkes University, PA 18766 USA
> 570-332-2942
> http://web.wilkes.edu/jeffrey.stratford/
> 
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] error relating to rda function in vegan

2014-03-09 Thread Peter Solymos
Try using is.na (missing value) instead of is.nan (not a number).

Peter

--
Péter Sólymos
780-492-8534 | soly...@ualberta.ca | http://psolymos.github.com
Alberta Biodiversity Monitoring Institute http://www.abmi.ca
Boreal Avian Modelling Project http://www.borealbirds.ca


On Sun, Mar 9, 2014 at 10:23 PM, Rajendra Mohan panda  wrote:

> Dear members
>
> I have been performing rda in vegan using following code which works with
> one set of data but show errors in another set of data with same syntax:
>
> modH <- rda(decostand (dataA, "hell", na.rm = TRUE) ~ ., data = dataB,
> na.action =na.omit, scale = TRUE)
>
> I get following error continously
> Error in svd(Xbar, nu = 0, nv = 0) : infinite or missing values in 'x'
>
> Even if I use
> mydata<- scale(mydata)
> mydata[is.nan(mydata)] <- 0
>
> The error continues
>
> However, the rda without transformation is possible in both set of data
>
> I will be grateful to your kind help and guidance.
>
> With best Regards
> Rajendra M Panda
> SWR, IIT KGP
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] data structures for ecological data

2014-11-19 Thread Peter Solymos
Jason,

The segments in 'mefa' just add a 3rd dimension to the object, but that
does not limit accessing the stored information. Sample attributes can have
spatial information, but it is not specifically designed to support spatial
analysis. More concrete feature requests are welcome.

There is an update of 'mefa' called 'mefa4', and also Steve Walker's
'multitable' package, none of them comes with pre-built spatial analytics
capabilities but might worth checking out.

Cheers,

Peter

--
Péter Sólymos
780-492-8534 | soly...@ualberta.ca | https://sites.google.com/site/psolymos/
Alberta Biodiversity Monitoring Institute http://www.abmi.ca
Boreal Avian Modelling Project http://www.borealbirds.ca

On Wed, Nov 19, 2014 at 2:26 PM, Hyatt Green  wrote:

> Check out the picante package. That might help...
>
> Good luck,
> Hyatt
>
> On Wed, Nov 19, 2014 at 3:57 PM, Law, Jason 
> wrote:
>
> > Hello all,
> >
> > This is a general question about data structures for what I'll call
> > generically 'organism occurrence data'. Most of the packages for dealing
> > with ecology/phylogenetic data in R deal with some combination of
> > occurrence data (sample unit x taxa), site/sample data (data about the
> > sampling units), trait data (data about the taxa), and taxonomic (some
> kind
> > of tree, taxonomic or phylogenetic). I started to look around and see if
> > there was a package that provided data structures and some generic
> > functionality for manipulating this kind of data in R; I think I'm
> looking
> > for the "organism occurrence" equivalent of the "sp" package for spatial
> > data. The mefa package does some of this, but the idea of 'segments'
> > appears to introduce some limitations. And it doesn't have things like
> > accessors that I would expect for data structures (coords for
> > SpatialDataFrames). The phylobase package does this for phylogenetic
> data,
> > but I'm looking for something that would include occurrence data.
> >
> > Is there currently a package that provides data structures (other than
> > mefa) for this type of data. If not, has anyone ever talked about trying
> to
> > put something like this together?
> >
> > Jason Law
> > Statistician, City of Portland
> > Water Pollution Control Laboratory
> > 6543 N Burlington Ave,
> > Portland, OR
> >
> > 503-823-1038
> > jason@portlandoregon.gov
> >
> >
> > [[alternative HTML version deleted]]
> >
> > ___
> > R-sig-ecology mailing list
> > R-sig-ecology@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
> >
>
>
>
> --
> Hyatt Green, PhD
> Assistant Professor of Microbiology
> Department of Environmental and Forest Biology
> SUNY-ESF
> 1 Forestry Drive
> Syracuse, NY 13210
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] multipart

2015-04-07 Thread Peter Solymos
Hi Saifi,

Here is how you can set up your design variables to be used in the formula
interface of multipart() or adipart() in vegan. You need to adjust the
settings and make sure that the results make sense, because you know the
data.

library(vegan)
# x <- structure(...) # just copied your data frame from dput()
x <- as.matrix(x)
z <- data.frame(
transect=1:30,
site=rep(1:10, each=3),
treatment=as.factor(rep(c("grazed","protected"), each=15)))
(res <- multipart(x ~ transect + site + treatment, z))

Cheers,

Peter


--
Péter Sólymos
780-492-8534 | soly...@ualberta.ca | peter.solymos.org
Alberta Biodiversity Monitoring Institute http://www.abmi.ca
Boreal Avian Modelling Project http://www.borealbirds.ca

On Tue, Apr 7, 2015 at 1:22 PM, Merdas Saifi  wrote:

> 2015-04-07 20:33 UTC+02:00, Sarah Goslee :
> > Hi,
> >
> > You sent this just to me, not to the list. The list does not allow
> > attachments. Please follow the suggestions already made and provide
> > example data and code *in the body of your email*.
> >
> >
> > Without a reproducible example that includes some sample data (fake is
> > fine), the code you used, and some clear idea of what output you
> > expect, it's impossible to figure out how to help you. Here are some
> > suggestions for creating a good reproducible example:
> >
> http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
> >
> > Sarah
> >
> > On Tue, Apr 7, 2015 at 2:15 PM, Merdas Saifi  wrote:
> >> 2015-04-07 17:11 UTC+02:00, Sarah Goslee :
> >>> Please provide a reproducible example including sample data and the
> >>> code you've already tried.
> >>>
> >>> Without a reproducible example that includes some sample data (fake is
> >>> fine), the code you used, and some clear idea of what output you
> >>> expect, it's impossible to figure out how to help you. Here are some
> >>> suggestions for creating a good reproducible example:
> >>>
> http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
> >>>
> >>> Sarah
> >>>
> >>> On Tue, Apr 7, 2015 at 11:04 AM, Merdas Saifi 
> >>> wrote:
>  Hello,
> 
>  I'm trying to perform Multiplicative Diversity Partitioning with
>  multipart function in R, i've 30 transects with frequency or presence
>  absence data, 15 of them are grazed and 15 are protected, each 03
>  transects forms a site, so i want to analyse beta diversity at
>  multiple spatial scales to quantify among-transects and among-sites.
> 
>  Many thanks,
> 
>  Saifi
> 
> 
> 
> >>>
> >>> --
> >>> Sarah Goslee
> >>> http://www.functionaldiversity.org
> >>>
> >>
> >> Hello Sarah,
> >>
> >> Thank you for your reply
> >>
> >> I want to analyse beta diversity at multiple spatial scales
> >> among-transects and among-sites in grazed and protected areas.
> >> (exemple attached)
> >> i tried with following the example using (mite) dataset, but i did not
> >> know how to do?
> >>
> >> Best regards
> >>
> >> Saifi
> >>
> >
> > --
> > Sarah Goslee
> > http://www.functionaldiversity.org
> >
> Hello,
>
> I want to analyse beta diversity at multiple spatial scales
> among-transects and among-sites in grazed and protected areas.
> i tried with following the example using (mite) dataset, but i did not
> know how to do in the code?
>
> This is an example of data frame containing 30 transects and 101
> species, I've 02 traitements (grazed and protected areas)  the first
> 15 transects are grazed and the second 15 transects are protected,
> each 03 transects in each traitement forms a sites, so 5 sites grazed
> and 5 sites protected.
>
>
> > dput(dataframe)
> structure(list(Species1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 1L, 0L, 0L, 0L, 0L, 0L,
> 0L, 0L, 0L, 0L, 0L), Species2 = c(0L, 1L, 0L, 1L, 1L, 0L, 0L,
> 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 2L, 7L, 8L, 3L, 4L, 8L, 1L, 4L,
> 0L, 1L, 0L, 3L, 0L, 0L, 2L), Species3 = c(2L, 0L, 1L, 0L, 0L,
> 2L, 1L, 0L, 1L, 0L, 0L, 2L, 1L, 1L, 2L, 0L, 3L, 1L, 1L, 3L, 0L,
> 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 6L), Species4 = c(3L, 0L, 1L,
> 2L, 2L, 2L, 3L, 9L, 13L, 1L, 1L, 2L, 0L, 0L, 0L, 4L, 0L, 0L,
> 0L, 3L, 0L, 57L, 0L, 10L, 0L, 0L, 0L, 0L, 0L, 0L), Species5 = c(0L,
> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> 0L, 0L, 5L, 0L, 0L, 4L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), Species6 = c(56L,
> 20L, 3L, 15L, 10L, 25L, 55L, 2L, 0L, 17L, 45L, 32L, 12L, 15L,
> 14L, 74L, 20L, 25L, 23L, 32L, 7L, 63L, 83L, 85L, 210L, 167L,
> 157L, 199L, 121L, 74L), Species7 = c(4L, 8L, 6L, 3L, 6L, 12L,
> 0L, 5L, 2L, 4L, 17L, 4L, 12L, 1L, 7L, 0L, 1L, 0L, 0L, 0L, 0L,
> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), Species8 = c(2L, 1L, 0L,
> 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L,
> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L), Species9 = c(0L,
> 1L, 0L, 0L, 1L, 0L, 0L, 2L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L,
> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), Species10 = c(1L,
> 0L, 0L, 19L, 1L, 0L, 11L, 0

Re: [R-sig-eco] How to calculate relative abundance along taxonomical hierarchy

2015-06-25 Thread Peter Solymos
Gian,

Once you have your samples by OTU matrix row standardized, you can use a
level of your hierarchy (a vector matching the columns) and the
groupSums(your-matrix, 2, your-groups) function in the mefa4 package to get
your relative abundances.

Cheers,

Peter

Gian Maria Niccolò Benucci  ezt írta (2015. június
25., csütörtök):

> Hello everyone,
> I am working on a fungal dataset with 151 OTUs distributed in 20 samples. I
> have imported it as phyloseq object and as normal species matrix as well to
> work with the vegan package.
> I am trying to find a way to get relative abundances at different
> hierarchical level goruping the abundances of the OTUs present in my
> dataset. For example, if I want to know what is the relative abundance of
> each Phylum (or of each Family, or each Genus) how can I do? There is a way
> to do that inside R?
> Thank you very much in advance,
>
>
> --
> ​Gian​
>
>
>
> *- Do not print this email unless you really need to. Save paper and
> protect the environment! -*
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org 
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology



-- 
--
Péter Sólymos
780-492-8534 | soly...@ualberta.ca | peter.solymos.org
Alberta Biodiversity Monitoring Institute http://www.abmi.ca
Boreal Avian Modelling Project http://www.borealbirds.ca

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

Re: [R-sig-eco] How to calculate relative abundance along taxonomical hierarchy

2015-06-25 Thread Peter Solymos
The tax_table accessor function will give you the taxonomic ranks in a
table according to this website:
https://web.stanford.edu/class/bios221/labs/phyloseq/lab_phyloseq.html
 otu_table will give you the data matrix with OTUs as *rows*. So you need
to transpose or adjust the code I suggested before.

Peter

Gian Maria Niccolò Benucci  ezt írta (2015. június
25., csütörtök):

> Hello,
> thank you for the replies. My dataset is something like this:
>
> > datafungi
>  BR1  BR2  BR3  BR4  BR5   F1   F2   F3   F4   F5   R1   R2   R3
> R4   R5   W1   W2   W3   W4   W5
> OTU_1   0.08 0.13 0.56 0.90 0.91 0.36 0.05 0.02 0.13 0.11 0.00 0.00 0.15
> 0.06 0.00 0.12 0.00 0.00 0.03 0.05
> OTU_35  0.01 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> 0.00 0.00 0.00 0.01 0.00 0.01 0.06
> OTU_4   0.01 0.00 0.00 0.00 0.00 0.00 0.01 0.31 0.08 0.00 0.07 0.05 0.00
> 0.00 0.00 0.00 0.00 0.07 0.41 0.00
> OTU_3   0.00 0.00 0.00 ...
>
> So I do not have the taxonomic hierarchy inside the normal dataset, but
> only the OTUs.
>
> Anyway, have taxonomic ranks embedded it in the phyloseq object:
>
> > fungi
> phyloseq-class experiment-level object
> otu_table()   OTU Table: [ 150 taxa and 20 samples ]
> sample_data() Sample Data:   [ 20 samples by 5 sample variables ]
> tax_table()   Taxonomy Table:[ 150 taxa by 7 taxonomic ranks ]
>
> That's I wondered why to do it using this kind of data. I think that many
> other researcher that work with huge species dataset (this one is not very
> big but I worked with thousands of OTUs) will have this problem.
> Do you have an idea how to deal with this kind of data object?
> Thank you,
> Gian
>
>
>
>
>
> On 25 June 2015 at 11:51, Peter Solymos  > wrote:
>
> > Gian,
> >
> > Once you have your samples by OTU matrix row standardized, you can use a
> > level of your hierarchy (a vector matching the columns) and the
> > groupSums(your-matrix, 2, your-groups) function in the mefa4 package to
> get
> > your relative abundances.
> >
> > Cheers,
> >
> > Peter
> >
> > Gian Maria Niccolò Benucci > ezt
> írta (2015.
> > június 25., csütörtök):
> >
> >> Hello everyone,
> >> I am working on a fungal dataset with 151 OTUs distributed in 20
> samples.
> >> I
> >> have imported it as phyloseq object and as normal species matrix as well
> >> to
> >> work with the vegan package.
> >> I am trying to find a way to get relative abundances at different
> >> hierarchical level goruping the abundances of the OTUs present in my
> >> dataset. For example, if I want to know what is the relative abundance
> of
> >> each Phylum (or of each Family, or each Genus) how can I do? There is a
> >> way
> >> to do that inside R?
> >> Thank you very much in advance,
> >>
> >>
> >> --
> >> ​Gian​
> >>
> >>
> >>
> >> *- Do not print this email unless you really need to. Save paper and
> >> protect the environment! -*
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> ___
> >> R-sig-ecology mailing list
> >> R-sig-ecology@r-project.org 
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
> >
> >
> >
> > --
> > --
> > Péter Sólymos
> > 780-492-8534 | soly...@ualberta.ca  | peter.solymos.org
> > Alberta Biodiversity Monitoring Institute http://www.abmi.ca
> > Boreal Avian Modelling Project http://www.borealbirds.ca
> >
> >
>
>
> --
>
>
> *- Do not print this email unless you really need to. Save paper and
> protect the environment! -*
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org 
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology



-- 
--
Péter Sólymos
780-492-8534 | soly...@ualberta.ca | peter.solymos.org
Alberta Biodiversity Monitoring Institute http://www.abmi.ca
Boreal Avian Modelling Project http://www.borealbirds.ca

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

Re: [R-sig-eco] Package to analyse population time series (trend analysis)

2016-10-19 Thread Peter Solymos
Manuel,

There are few ecology focused packages besides ARIMA() and the forecast
package, for example:

- popbio (based mostly on the Matrix Population Models by Caswell (2001)
and Quantitative Conservation Biology by Morris and Doak (2002).

- PVAClone, that uses JAGS and is based on Nadeem, K., Lele S. R., 2012.
Likelihood based population viability analysis in the presence of
observation error. Oikos 121, 1656–1664.

I am sure there are other packages as well that I am not aware of...

Cheers,

Peter

--
Péter Sólymos
780-492-8534 | soly...@ualberta.ca | peter.solymos.org
Alberta Biodiversity Monitoring Institute http://www.abmi.ca
Boreal Avian Modelling Project http://www.borealbirds.ca

On Wed, Oct 19, 2016 at 6:47 AM, Manuel Spínola 
wrote:

> Dear list members,
>
> What is the appropriate package to analyze population time series (trend
> analysis) when you have one count per year.
>
> Best,
>
> Manuel
>
> --
> *Manuel Spínola, Ph.D.*
> Instituto Internacional en Conservación y Manejo de Vida Silvestre
> Universidad Nacional
> Apartado 1350-3000
> Heredia
> COSTA RICA
> mspin...@una.cr 
> mspinol...@gmail.com
> Teléfono: (506) 8706 - 4662
> Personal website: Lobito de río  site/lobitoderio/>
> Institutional website: ICOMVIS 
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

Re: [R-sig-eco] LEVELS function NULL

2020-06-23 Thread Peter Solymos
R >= 4.0 will treat categorical variables as character
(stringsAsFactors=FALSE is the new default) when importing via e.g.
read.table(). If you want factors, you have to make it explicit either as
Torsten showed or by setting stringsAsFactors=TRUE.

Cheers,

Peter

On Tue, Jun 23, 2020 at 10:05 AM Torsten Hauffe 
wrote:

> Hi,
>
> What is the output of
> class(my_data$Sampling)?
>
> If it is not factor (but character), you could write
> my_data$Sampling <- as.factor(my_data$Sampling)
>
> Then levels(my_data$Sampling) should return Sep18 etc.
>
> Cheers!
>
> On Tue, Jun 23, 2020 at 6:00 PM Vesna Gulin 
> wrote:
>
> > Hi everyone,
> >
> >
> >
> > It has been a couple months now of using R and I am slowly trying to
> switch
> > from using Statistica to R in order to process my data (for PhD
> purposes).
> >
> >
> >
> > I did some basic statistics and now I would like to try Kruskal-Wallis
> > (following this guideline:
> >
> > http://www.sthda.com/english/wiki/kruskal-wallis-test-in-r )
> >
> > but I have been stuck for days as I can't seem to find a reason why the
> > levels function keeps giving me NULL.
> >
> >
> >
> > My data is a data frame:
> >
> >
> >
> > my_data
> >
> >Temperature Sampling
> >
> > 1 9.90Dec17
> >
> > 210.00Dec17
> >
> > 310.00Dec17
> >
> > 410.10Dec17
> >
> > 5 9.90Dec17
> >
> > 610.10Dec17
> >
> > 710.20Dec17
> >
> > 810.10Jan18
> >
> > 910.50Jan18
> >
> > 10   10.80Jan18
> >
> > 11   10.30Jan18
> >
> > 12   10.30Jan18
> >
> > 13   10.20Jan18
> >
> > 14   10.50Jan18
> >
> > 15   20.90Jun18
> >
> > 16   20.80Jun18
> >
> > 17   20.50Jun18
> >
> > 18   20.50Jun18
> >
> > 19   20.70Jun18
> >
> > 20   20.60Jun18
> >
> > 21   20.50Jun18
> >
> > 22   10.50Mar18
> >
> > 23   11.40Mar18
> >
> > 24   10.90Mar18
> >
> > 25   10.60Mar18
> >
> > 26   11.60Mar18
> >
> > 27   11.20Mar18
> >
> > 28   12.20Mar18
> >
> > 29   19.40May18
> >
> > 30   19.00May18
> >
> > 31   20.80May18
> >
> > 32   19.50May18
> >
> > 33   18.60May18
> >
> > 34   18.10May18
> >
> > 35   19.20May18
> >
> > 36   16.80Nov18
> >
> > 37   16.50Nov18
> >
> > 38   18.40Nov18
> >
> > 39   18.10Nov18
> >
> > 40   16.70Nov18
> >
> > 41   16.70Nov18
> >
> > 42   18.50Nov18
> >
> > 43   15.50Oct17
> >
> > 44   15.10Oct17
> >
> > 45   15.00Oct17
> >
> > 46   15.80Oct17
> >
> > 47   15.20Oct17
> >
> > 48   15.10Oct17
> >
> > 49   15.20Oct17
> >
> > 50   23.40Sep18
> >
> > 51   24.05Sep18
> >
> > 52   23.70Sep18
> >
> > 53   23.40Sep18
> >
> > 54   24.05Sep18
> >
> > 55   23.80Sep18
> >
> > 56   24.15Sep18
> >
> >
> >
> > When I hit levels(my_data$Sampling) it gives me
> > NULL
> >
> >
> >
> > I would very much appreciate any help.
> >
> >
> >
> > Kind regards,
> >
> > Vesna
> >
> >
> >
> >
> >
> >
> >
> > Vesna Gulin,  Research Assistant
> >
> >
> >
> > Department of Biology
> >
> >
> >
> > Faculty of Science
> >
> > University of Zagreb
> >
> >
> >
> > Rooseveltov trg 6, 1 Zagreb, Croatia
> >
> >
> >
> >
> > [[alternative HTML version deleted]]
> >
> > ___
> > R-sig-ecology mailing list
> > R-sig-ecology@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
> >
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>


-- 
Péter Sólymos, PhD
Senior Statistical Ecologist
soly...@ualberta.ca 780-492-8534
Alberta Biodiversity Monitoring Institute http://www.abmi.ca
Boreal Avian Modelling Project http://www.borealbirds.ca

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] DCCA in R?

2020-11-05 Thread Peter Solymos
Jonathan,

Have you checked ?vegan::decorana (it is also mentioned in the vignette on
p 2: https://cran.r-project.org/web/packages/vegan/vignettes/intro-vegan.pdf
)

Cheers,

Peter

On Thu, Nov 5, 2020 at 11:03 AM Jonathan Gordon 
wrote:

> Hello,
>
> I’m aiming to perform a detrended canonical correspondence analysis on a
> pollen sequence, to create beta diversity curves for pollen sequences
> through time. An example of this being done is in Felde (2019) <
> https://link.springer.com/article/10.1007/s00334-019-00726-5>. As far as
> I can tell, DCCA can’t be done in R packages such as vegan, and can only be
> completed in Canoco. I’m keen not to use Canoco because it's not open
> access and the processes behind the interface are obscured, making for a
> non-repeatable methodology. Please could you advise any way for me to
> perform a DCCA to obtain my desired results, or of a different method?
>
> Best wishes,
> Jonny
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>


-- 
Péter Sólymos, PhD
Senior Statistical Ecologist
soly...@ualberta.ca 780-492-8534
Alberta Biodiversity Monitoring Institute http://www.abmi.ca
Boreal Avian Modelling Project http://www.borealbirds.ca

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] DCCA in R?

2020-11-11 Thread Peter Solymos
You are right, Attila, I missed the canonical part. It is either detrended
& unconstrained (decorana) or plain & constrained/canonical (cca).

Cheers,

Peter

On Wed, Nov 11, 2020 at 3:27 PM Attila Lengyel  wrote:

> Dear Peter,
>
> as far as I know, vegan::decorana does not do DCCA (detrended canonical
> correspondence analysis) but the unconstrained DCA. If anyone knows the
> answer for Jonathan's question, please, share it with me, I would also be
> interested.
> Best regards,
>
> Attila
>
> Peter Solymos  ezt írta (időpont: 2020. nov. 5., Cs,
> 19:19):
>
>> Jonathan,
>>
>> Have you checked ?vegan::decorana (it is also mentioned in the vignette on
>> p 2:
>> https://cran.r-project.org/web/packages/vegan/vignettes/intro-vegan.pdf
>> )
>>
>> Cheers,
>>
>> Peter
>>
>> On Thu, Nov 5, 2020 at 11:03 AM Jonathan Gordon <
>> jonathangordon...@gmail.com>
>> wrote:
>>
>> > Hello,
>> >
>> > I’m aiming to perform a detrended canonical correspondence analysis on a
>> > pollen sequence, to create beta diversity curves for pollen sequences
>> > through time. An example of this being done is in Felde (2019) <
>> > https://link.springer.com/article/10.1007/s00334-019-00726-5>. As far
>> as
>> > I can tell, DCCA can’t be done in R packages such as vegan, and can
>> only be
>> > completed in Canoco. I’m keen not to use Canoco because it's not open
>> > access and the processes behind the interface are obscured, making for a
>> > non-repeatable methodology. Please could you advise any way for me to
>> > perform a DCCA to obtain my desired results, or of a different method?
>> >
>> > Best wishes,
>> > Jonny
>> > [[alternative HTML version deleted]]
>> >
>> > ___
>> > R-sig-ecology mailing list
>> > R-sig-ecology@r-project.org
>> > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>> >
>>
>>
>> --
>> Péter Sólymos, PhD
>> Senior Statistical Ecologist
>> soly...@ualberta.ca 780-492-8534
>> Alberta Biodiversity Monitoring Institute http://www.abmi.ca
>> Boreal Avian Modelling Project http://www.borealbirds.ca
>>
>> [[alternative HTML version deleted]]
>>
>> ___
>> R-sig-ecology mailing list
>> R-sig-ecology@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>
>

-- 
Péter Sólymos, PhD
Senior Statistical Ecologist
soly...@ualberta.ca 780-492-8534
Alberta Biodiversity Monitoring Institute http://www.abmi.ca
Boreal Avian Modelling Project http://www.borealbirds.ca

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology