Re: [R-sig-eco] Calculate AIC, DIC and BIC for models machine learning

2019-03-22 Thread Jane Elith
Hi Lara, Sarah
There is a problem in estimation of degrees of freedom so even though some 
people try to estimate AIC eg for Maxent models, it's not correct (the number 
of parameters in the model is not the effective degrees of freedom). Same for 
GBM, RF etc.
e.g. see Hauenstein, S., Wood, S.N. & Dormann, C.F. (2018) Computing AIC for 
black-box models using generalized degrees of freedom: A comparison with 
cross-validation. Communications in Statistics - Simulation and Computation, 
47, 1382–1396. 
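
To illustrate why the parameter count and the effective degrees of freedom differ, here is a minimal sketch of a perturbation-based estimate of generalized degrees of freedom (GDF). It is not the code from the paper: the function name estimate_gdf, the helper fit_predict, the perturbation size and the simulated data are all assumptions made for illustration. The idea is to add small random noise to the response many times, refit, and sum the sensitivities of each fitted value to its own observation.

```r
# Perturbation-based GDF estimate (illustrative sketch, base R only).
# fit_predict is a hypothetical helper: it takes a response vector and
# returns the fitted values of the model of interest.
estimate_gdf <- function(y, fit_predict, n_perturb = 200,
                         sd_perturb = 0.25 * sd(y)) {
  n <- length(y)
  # One row per perturbation, one column per observation
  delta <- matrix(rnorm(n_perturb * n, sd = sd_perturb), nrow = n_perturb)
  fitted_pert <- matrix(NA_real_, nrow = n_perturb, ncol = n)
  for (t in seq_len(n_perturb)) {
    fitted_pert[t, ] <- fit_predict(y + delta[t, ])
  }
  # Sensitivity of fitted value i to observation i:
  # slope of fitted_pert[, i] regressed on delta[, i]
  slopes <- vapply(seq_len(n), function(i) {
    cov(fitted_pert[, i], delta[, i]) / var(delta[, i])
  }, numeric(1))
  sum(slopes)
}

# Simulated data (assumption, for illustration only)
set.seed(1)
n  <- 60
x1 <- runif(n)
x2 <- runif(n)
y  <- 1 + 2 * x1 - x2 + rnorm(n, sd = 0.5)

# Linear model: the estimate should be close to the 3 coefficients
lm_fit <- function(y_new) fitted(lm(y_new ~ x1 + x2))
gdf_lm <- estimate_gdf(y, lm_fit)

# Flexible smoother: the effective df depends on the span, not on a
# count of coefficients
loess_fit <- function(y_new) fitted(loess(y_new ~ x1, span = 0.3))
gdf_loess <- estimate_gdf(y, loess_fit)

print(c(lm = gdf_lm, loess = gdf_loess))
```

For the linear model the estimate is a noisy approximation of the trace of the hat matrix, i.e. the number of coefficients. For RF, GBM and similar models there is no such closed form, which is why an AIC based on a raw parameter count is not meaningful.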

Jane Elith

On 22/3/19, 10:05 pm, "R-sig-ecology on behalf of 
r-sig-ecology-requ...@r-project.org"  wrote:


Message: 2
Date: Thu, 21 Mar 2019 09:55:03 -0400
From: Sarah Goslee 
To: Lara Silva 
Cc: "r-sig-ecology@r-project.org" 
Subject: Re: [R-sig-eco]  Calculate AIC, DIC and BIC for models
machine learning
Message-ID:

Content-Type: text/plain; charset="utf-8"

Yes, of course it is.

Many of the machine learning packages in R offer AIC as part of their
model summaries.

You should probably spend some time with the CRAN Machine Learning Task View
to discover more about R's extensive capabilities.

https://cran.r-project.org/web/views/MachineLearning.html

Sarah

On Thu, Mar 21, 2019 at 9:40 AM Lara Silva  wrote:
>
> Hello everyone!
>
> In R, is it possible to calculate AIC, DIC, or BIC for machine
> learning models, such as RF, ANN, GBM, MARS?
>
> Are there any functions or specific packages in R?
>
> Any suggestion?
>
> Thanks
>
> Lara
>

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


[R-sig-eco] Fwd: [EXTERNAL] Which R package for Second-Stage nMDS ?

2019-03-22 Thread Philippi, Tom
Pierre--
That's actually an easy question.  Look at the vegan package
https://cran.r-project.org/web/packages/vegan/index.html.  It has functions
for various dissimilarity & distance metrics (and can use distance matrices
from other packages such as ecodist), MDS and allied processes, ANOSIM and
restricted Mantel tests, and many other approaches.

More broadly, the CRAN Task View for Environmetrics is a great place to
start for finding packages directed at ecological questions and analyses.
https://cran.r-project.org/web/views/Environmetrics.html

Tom


On Fri, Mar 22, 2019 at 3:43 AM Pierre THIRIET 
wrote:

> Dear useRs,
>
> I want to perform 2nd stage nMDS, as described in Clarke, K.R., et al
> (2006). Exploring interactions by second-stage community analyses.
> Journal of Experimental Marine Biology and Ecology 338, 179-192. See
> Abstract below
>
> Do you know a package in R for that ? Or would you have home-made
> scripts, at least a function for computing the distance matrix of
> pair-wise correlations among dissimilarity matrices ?
>
> Thank you,
>
> Pierre
>
>
> Abstract of Clarke et al 2006 :
>
> Many biological data sets, from field observations and manipulative
> experiments, involve crossed factor designs, analysed in a univariate
> context by higher-way analyses of variance which partition out ‘main’
> and ‘interaction’ effects. Indeed, tests for significance of
> interactions among factors, such as differing Before–After responses at
> Control and Impact sites, are the basis of the widely used BACI strategy
> for detecting impacts in the environment. There are difficulties,
> however, in generalising simple univariate definitions of interaction,
> from classic linear models, to the robust, non-parametric multivariate
> methods that are commonly required in handling assemblage data. The size
> of an interaction term, and even its existence at all, depends crucially
> on the measurement scale, so it is fundamentally a parametric construct.
> Despite this, certain forms of interaction can be examined using
> non-parametric methods, namely those evidenced by changing assemblage
> patterns over many time periods, for replicate sites from different
> experimental conditions (types of ‘Beyond BACI’ design) – or changing
> multivariate structure over space, at many observed times. *Second-stage
> MDS, which can be thought of as an MDS plot of the pairwise similarities
> between MDS plots (e.g. of assemblage time trajectories), can be used to
> illustrate such interactions, and they can be formally tested by
> second-stage ANOSIM permutation tests. Similarities between
> (first-stage) multivariate patterns are assessed by rank-based matrix
> correlations, preserving the fully non-parametric approach common in
> marine community studies. *The method is exemplified using time-series
> data on corals from Thailand, macrobenthos from Tees Bay, UK, and
> macroalgae from a complex recolonisation experiment carried out in the
> Ligurian Sea, Italy. The latter data set is also used to demonstrate how
> the analysis copes straightforwardly with certain repeated-measures
> designs.
>
>


Re: [R-sig-eco] R-sig-ecology Digest, Vol 132, Issue 10

2019-03-22 Thread Lara Silva
Prof. Dr. Ralf Schäfer,

Thank you for sharing the links.

Best regards,

Lara Silva



On Friday, 22/03/2019 at 11:30, Ralf Schäfer wrote:

> Hi Lara,
>
> I am actually not sure that this is the best way to proceed.
> Cross-validation seems the method of choice and depending on your purpose
> you can compare the prediction error between models.
> See: Hauenstein S., Wood S.N. & Dormann C.F. (2018). Computing AIC for
> black-box models using generalized degrees of freedom: A comparison with
> cross-validation. *Communications in Statistics - Simulation and
> Computation* *47*, 1382–1396.
> https://doi.org/10.1080/03610918.2017.1315728
>
> However, these authors provide code to derive an AIC for different machine
> learning approaches. https://github.com/biometry/GDF
>
> Hope this helps and have a nice weekend.
>
> Best regards,
>
> Ralf Schäfer
>
> 
>
> Prof. Dr. Ralf Bernhard Schäfer
> Professor for Quantitative Landscape Ecology
> Environmental Scientist (M.Sc.)
> Institute for Environmental Sciences
> University Koblenz-Landau
> Fortstrasse 7
> 76829 Landau
> Germany
> Mail: schaefer-r...@uni-landau.de
> Phone: ++49 (0) 6341 280-31536
> Web: www.landscapecology.uni-landau.de
>
> On 22.03.2019 at 12:00, r-sig-ecology-requ...@r-project.org wrote:
>
>
> Message: 1
> Date: Thu, 21 Mar 2019 12:40:54 -0100
> From: Lara Silva 
> To: r-sig-ecology@r-project.org
> Subject: [R-sig-eco] Calculate AIC, DIC and BIC for models machine
> learning
> Message-ID:
> 
> Content-Type: text/plain; charset="utf-8"
>
> Hello everyone!
>
> In R, is it possible to calculate AIC, DIC, or BIC for machine
> learning models, such as RF, ANN, GBM, MARS?
>
> Are there any functions or specific packages in R?
>
> Any suggestion?
>
> Thanks
>
> Lara
>
>
>



Re: [R-sig-eco] Which R package for Second-Stage nMDS ?

2019-03-22 Thread Pierre THIRIET

Thanks a lot, dear Philip. I will try this.

all the best

Pierre


On 22/03/2019 at 17:07, Dixon, Philip M [STAT] wrote:

Pierre,

I don't know a function that does this, but it is extremely easy to code.

Dist objects are vectors containing the 1st stage pairwise dissimilarities.   Call 
those dist1, dist2, dist3, ...  So alldist <- cbind(dist1=dist1, dist2=dist2, ...) 
will assemble the matrix of dissimilarities with useful column names.  stage2 <- 
as.dist(1-cor(alldist)) will compute the matrix of correlations, convert from 
similarity (the correlation) to distance (1-correlation) and convert to a distance 
object.  Then just run your favorite MDS on stage2.

Note: sometimes folks prefer sqrt(1-cor) as the "correlation distance", instead 
of 1-cor.  I don't know which Clarke prefers.

Best,
Philip Dixon



[R-sig-eco] Which R package for Second-Stage nMDS ?

2019-03-22 Thread Dixon, Philip M [STAT]
Pierre,

I don't know a function that does this, but it is extremely easy to code.

Dist objects are vectors containing the 1st stage pairwise dissimilarities.   
Call those dist1, dist2, dist3, ...  So alldist <- cbind(dist1=dist1, 
dist2=dist2, ...) will assemble the matrix of dissimilarities with useful 
column names.  stage2 <- as.dist(1-cor(alldist)) will compute the matrix of 
correlations, convert from similarity (the correlation) to distance 
(1-correlation) and convert to a distance object.  Then just run your favorite 
MDS on stage2.  

Note: sometimes folks prefer sqrt(1-cor) as the "correlation distance", instead 
of 1-cor.  I don't know which Clarke prefers.
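
A minimal sketch of the steps above, under some assumptions: the site count, the simulated Poisson abundances and all object names other than alldist and stage2 are made up for illustration. The first-stage dissimilarities are Euclidean from base R (vegan::vegdist could supply Bray-Curtis instead). The correlation is Spearman because the Clarke et al. abstract mentions rank-based matrix correlations; plain cor(alldist) gives the Pearson version.

```r
set.seed(42)

# Hypothetical data: 6 sites, each an 8 (time points) x 5 (species) matrix
n_sites   <- 6
n_times   <- 8
n_species <- 5
site_data <- lapply(seq_len(n_sites), function(s) {
  matrix(rpois(n_times * n_species, lambda = 5), nrow = n_times)
})
names(site_data) <- paste0("site", seq_len(n_sites))

# First stage: one dist object per site (among time points within a site)
dist_list <- lapply(site_data, dist)

# Each dist object is a vector of pairwise dissimilarities; bind them
# into a matrix with one column per site
alldist <- sapply(dist_list, as.vector)

# Second stage: rank correlation among the first-stage dissimilarity
# vectors, converted to a distance (1 - correlation)
stage2 <- as.dist(1 - cor(alldist, method = "spearman"))

# Ordination of the second-stage distances. cmdscale() is classical
# (metric) MDS from base R; for non-metric MDS, pass stage2 to
# MASS::isoMDS() or vegan::metaMDS() instead.
ord <- cmdscale(stage2, k = 2)
print(round(as.matrix(stage2), 2))
```

Swapping in sqrt(1 - cor(...)) on the stage2 line gives the alternative "correlation distance" mentioned in the note.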

Best,
Philip Dixon



Re: [R-sig-eco] R-sig-ecology Digest, Vol 132, Issue 10

2019-03-22 Thread Ralf Schäfer
Hi Lara,

I am actually not sure that this is the best way to proceed.
Cross-validation seems the method of choice and depending on your purpose you 
can compare the prediction error between models.
See: Hauenstein S., Wood S.N. & Dormann C.F. (2018). Computing AIC for 
black-box models using generalized degrees of freedom: A comparison with 
cross-validation. Communications in Statistics - Simulation and Computation 47, 
1382–1396. https://doi.org/10.1080/03610918.2017.1315728 


However, these authors provide code to derive an AIC for different machine 
learning approaches. https://github.com/biometry/GDF
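
The cross-validation comparison mentioned above can be sketched as follows. This is only an illustration in base R: the simulated data, the helper cv_rmse and the two candidate models are assumptions, not part of the cited paper or of the GDF repository. Both models are scored on exactly the same folds, so their prediction errors are directly comparable.

```r
set.seed(123)

# Simulated data (assumption): a nonlinear signal plus noise
n   <- 200
x   <- runif(n, 0, 10)
y   <- sin(x) + rnorm(n, sd = 0.3)
dat <- data.frame(x = x, y = y)

# Assign every observation to one of k folds
k    <- 5
fold <- sample(rep(seq_len(k), length.out = n))

# cv_rmse is a hypothetical helper: fit_fun takes a training data frame
# and returns a fitted model that has a predict() method
cv_rmse <- function(fit_fun) {
  sq_err <- numeric(n)
  for (f in seq_len(k)) {
    train <- dat[fold != f, ]
    test  <- dat[fold == f, ]
    model <- fit_fun(train)
    sq_err[fold == f] <- (test$y - predict(model, newdata = test))^2
  }
  sqrt(mean(sq_err))  # root mean squared prediction error on held-out data
}

rmse_linear <- cv_rmse(function(d) lm(y ~ x, data = d))
rmse_poly   <- cv_rmse(function(d) lm(y ~ poly(x, 5), data = d))

print(c(linear = rmse_linear, poly5 = rmse_poly))
```

In principle any learner with a predict() method could be passed as fit_fun, although some packages need extra arguments at prediction time, so the helper may have to be adapted.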

Hope this helps and have a nice weekend.

Best regards,

Ralf Schäfer



Prof. Dr. Ralf Bernhard Schäfer
Professor for Quantitative Landscape Ecology
Environmental Scientist (M.Sc.)
Institute for Environmental Sciences
University Koblenz-Landau
Fortstrasse 7
76829 Landau
Germany
Mail: schaefer-r...@uni-landau.de
Phone: ++49 (0) 6341 280-31536
Web: www.landscapecology.uni-landau.de

> On 22.03.2019 at 12:00, r-sig-ecology-requ...@r-project.org wrote:
> 
> 
> Message: 1
> Date: Thu, 21 Mar 2019 12:40:54 -0100
> From: Lara Silva <lara.sfp.si...@gmail.com>
> To: r-sig-ecology@r-project.org 
> Subject: [R-sig-eco] Calculate AIC, DIC and BIC for models machine
>   learning
> Message-ID:
> Content-Type: text/plain; charset="utf-8"
> 
> Hello everyone!
> 
> In R, is it possible to calculate AIC, DIC, or BIC for machine
> learning models, such as RF, ANN, GBM, MARS?
> 
> Are there any functions or specific packages in R?
> 
> Any suggestion?
> 
> Thanks
> 
> Lara




[R-sig-eco] Which R package for Second-Stage nMDS ?

2019-03-22 Thread Pierre THIRIET
Dear useRs,

I want to perform 2nd stage nMDS, as described in Clarke, K.R., et al 
(2006). Exploring interactions by second-stage community analyses. 
Journal of Experimental Marine Biology and Ecology 338, 179-192. See 
Abstract below

Do you know a package in R for that ? Or would you have home-made 
scripts, at least a function for computing the distance matrix of 
pair-wise correlations among dissimilarity matrices ?

Thank you,

Pierre


Abstract of Clarke et al 2006 :

Many biological data sets, from field observations and manipulative 
experiments, involve crossed factor designs, analysed in a univariate 
context by higher-way analyses of variance which partition out ‘main’ 
and ‘interaction’ effects. Indeed, tests for significance of 
interactions among factors, such as differing Before–After responses at 
Control and Impact sites, are the basis of the widely used BACI strategy 
for detecting impacts in the environment. There are difficulties, 
however, in generalising simple univariate definitions of interaction, 
from classic linear models, to the robust, non-parametric multivariate 
methods that are commonly required in handling assemblage data. The size 
of an interaction term, and even its existence at all, depends crucially 
on the measurement scale, so it is fundamentally a parametric construct. 
Despite this, certain forms of interaction can be examined using 
non-parametric methods, namely those evidenced by changing assemblage 
patterns over many time periods, for replicate sites from different 
experimental conditions (types of ‘Beyond BACI’ design) – or changing 
multivariate structure over space, at many observed times. *Second-stage 
MDS, which can be thought of as an MDS plot of the pairwise similarities 
between MDS plots (e.g. of assemblage time trajectories), can be used to 
illustrate such interactions, and they can be formally tested by 
second-stage ANOSIM permutation tests. Similarities between 
(first-stage) multivariate patterns are assessed by rank-based matrix 
correlations, preserving the fully non-parametric approach common in 
marine community studies. *The method is exemplified using time-series 
data on corals from Thailand, macrobenthos from Tees Bay, UK, and 
macroalgae from a complex recolonisation experiment carried out in the 
Ligurian Sea, Italy. The latter data set is also used to demonstrate how 
the analysis copes straightforwardly with certain repeated-measures designs.

