Re: [R] AIC function and Step function

2008-12-01 Thread Dana77

Thank you, Kingsford. 

Then I am wondering if there are other ways to write R codes to calculate
the weights ? Thanks!


Dana



Kingsford Jones wrote:
 
 On Sun, Nov 30, 2008 at 5:05 PM, Dana77 [EMAIL PROTECTED] wrote:

  Thanks for kind help from Steven and Christos last time.  Now I got new
 problem regarding the codes for calculating the weights (w) in AIC ()
 function.
 The original code is as below:
   getAnywhere(logLik.lm)
 function (object, REML = FALSE, ...)
  {
res - object$residuals
p - object$rank
N - length(res)
if (is.null(w - object$weights)) {
w - rep.int(1, N)
}else {
excl - w == 0
if (any(excl)) {
res - res[!excl]
N - length(res)
w - w[!excl]
}
}

  Now my question is, if I use lm() function to fit a multiple linear
 regression model, such as mod.fit-lm(formula = Y~ X1 + X2 + X3, data =
 set1), what code could I use to extract the weights (w) out? or how to
 calculate the weights(w) shown in above codes?
 
 
 mod.fit won't have weights because you didn't specify any through the
 weights argument to lm.  If you had, you could extract them using the
 same technique used in the above code:  w - mod.fit$weights
 
 hth,
 
 Kingsford Jones
 
 
 
 
 
 
 Thanks for your time and kind
 help!

 Dana



 Steven McKinney wrote:

 Hi Dana,

 Many thanks to Christos Hatzis who sent
 me an offline response, pointing out the
 new functions that make this much
 easier than my last suggestions:
 methods() and getAnywhere()

 methods(extractAIC)
 [1] extractAIC.aov* extractAIC.coxph*   extractAIC.glm*
 extractAIC.lm*  extractAIC.negbin*
 [6] extractAIC.survreg*

Non-visible functions are asterisked
 getAnywhere(extractAIC.coxph)
 A single object matching 'extractAIC.coxph' was found
 It was found in the following places
   registered S3 method for extractAIC from namespace stats
   namespace:stats
 with value

 function (fit, scale, k = 2, ...)
 {
 edf - length(fit$coef)
 loglik - fit$loglik[length(fit$loglik)]
 c(edf, -2 * loglik + k * edf)
 }
 environment: namespace:stats


 Thank you Christos.


 That said, one of the advantages of getting
 the source code is that it has comments that
 are stripped out when the code is sourced into R

 e.g. from the command line

 getAnywhere(AIC.default)
 A single object matching 'AIC.default' was found
 It was found in the following places
   registered S3 method for AIC from namespace stats
   namespace:stats
 with value

 function (object, ..., k = 2)
 {
 ll - if (stats4 %in% loadedNamespaces())
 stats4:::logLik
 else logLik
 if (length(list(...))) {
 object - list(object, ...)
 val - lapply(object, ll)
 val - as.data.frame(t(sapply(val, function(el) c(attr(el,
 df), AIC(el, k = k)
 names(val) - c(df, AIC)
 Call - match.call()
 Call$k - NULL
 row.names(val) - as.character(Call[-1])
 val
 }
 else AIC(ll(object), k = k)
 }
 environment: namespace:stats

From the source file


 #  File src/library/stats/R/AIC.R
 #  Part of the R package, http://www.R-project.org
 #
 #  This program is free software; you can redistribute it and/or modify
 #  it under the terms of the GNU General Public License as published by
 #  the Free Software Foundation; either version 2 of the License, or
 #  (at your option) any later version.
 #
 #  This program is distributed in the hope that it will be useful,
 #  but WITHOUT ANY WARRANTY; without even the implied warranty of
 #  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 #  GNU General Public License for more details.
 #
 #  A copy of the GNU General Public License is available at
 #  http://www.r-project.org/Licenses/

  Return the object's value of the Akaike Information Criterion
  (or An Inf.. Crit..)

 AIC - function(object, ..., k = 2) UseMethod(AIC)

 ## AIC for logLik objects
 AIC.logLik - function(object, ..., k = 2)
 -2 * c(object) + k * attr(object, df)

 AIC.default - function(object, ..., k = 2)
 {
 ## AIC for various fitted objects --- any for which there's a
 logLik()
 method:
 ll - if(stats4 %in% loadedNamespaces()) stats4:::logLik else
 logLik
 if(length(list(...))) {# several objects: produce data.frame
   object - list(object, ...)
   val - lapply(object, ll)
   val - as.data.frame(t(sapply(val,
 function(el)
 c(attr(el, df), AIC(el, k = k)
   names(val) - c(df, AIC)
 Call - match.call()
 Call$k - NULL
   row.names(val) - as.character(Call[-1])
   val
 } else AIC(ll(object), k = k)
 }



 Steven McKinney

 Statistician
 Molecular Oncology and Breast Cancer Program
 British Columbia Cancer Research Centre

 email: smckinney +at+ bccrc +dot+ ca

 tel: 604-675-8000 x7561

 BCCRC
 Molecular Oncology
 675 West 10th Ave, Floor 4
 Vancouver 

Re: [R] AIC function and Step function

2008-11-30 Thread Dana77

 Thanks for kind help from Steven and Christos last time.  Now I got new
problem regarding the codes for calculating the weights (w) in AIC ()
function. 
The original code is as below:
  getAnywhere(logLik.lm)
function (object, REML = FALSE, ...) 
  {
res - object$residuals
p - object$rank
N - length(res)
if (is.null(w - object$weights)) {
w - rep.int(1, N)
}else {
excl - w == 0
if (any(excl)) {
res - res[!excl]
N - length(res)
w - w[!excl]
}
}

 Now my question is, if I use lm() function to fit a multiple linear
regression model, such as mod.fit-lm(formula = Y~ X1 + X2 + X3, data =
set1), what code could I use to extract the weights (w) out? or how to
calculate the weights(w) shown in above codes? Thanks for your time and kind
help!

Dana



Steven McKinney wrote:
 
 Hi Dana,
 
 Many thanks to Christos Hatzis who sent
 me an offline response, pointing out the
 new functions that make this much
 easier than my last suggestions:
 methods() and getAnywhere()
 
 methods(extractAIC)
 [1] extractAIC.aov* extractAIC.coxph*   extractAIC.glm*
 extractAIC.lm*  extractAIC.negbin* 
 [6] extractAIC.survreg*
 
Non-visible functions are asterisked
 getAnywhere(extractAIC.coxph)
 A single object matching ‘extractAIC.coxph’ was found
 It was found in the following places
   registered S3 method for extractAIC from namespace stats
   namespace:stats
 with value
 
 function (fit, scale, k = 2, ...) 
 {
 edf - length(fit$coef)
 loglik - fit$loglik[length(fit$loglik)]
 c(edf, -2 * loglik + k * edf)
 }
 environment: namespace:stats
 
 
 Thank you Christos.
 
 
 That said, one of the advantages of getting
 the source code is that it has comments that
 are stripped out when the code is sourced into R
 
 e.g. from the command line
 
 getAnywhere(AIC.default)
 A single object matching ‘AIC.default’ was found
 It was found in the following places
   registered S3 method for AIC from namespace stats
   namespace:stats
 with value
 
 function (object, ..., k = 2) 
 {
 ll - if (stats4 %in% loadedNamespaces()) 
 stats4:::logLik
 else logLik
 if (length(list(...))) {
 object - list(object, ...)
 val - lapply(object, ll)
 val - as.data.frame(t(sapply(val, function(el) c(attr(el, 
 df), AIC(el, k = k)
 names(val) - c(df, AIC)
 Call - match.call()
 Call$k - NULL
 row.names(val) - as.character(Call[-1])
 val
 }
 else AIC(ll(object), k = k)
 }
 environment: namespace:stats
 
From the source file
 
 
 #  File src/library/stats/R/AIC.R
 #  Part of the R package, http://www.R-project.org
 #
 #  This program is free software; you can redistribute it and/or modify
 #  it under the terms of the GNU General Public License as published by
 #  the Free Software Foundation; either version 2 of the License, or
 #  (at your option) any later version.
 #
 #  This program is distributed in the hope that it will be useful,
 #  but WITHOUT ANY WARRANTY; without even the implied warranty of
 #  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 #  GNU General Public License for more details.
 #
 #  A copy of the GNU General Public License is available at
 #  http://www.r-project.org/Licenses/
 
  Return the object's value of the Akaike Information Criterion
  (or An Inf.. Crit..)
 
 AIC - function(object, ..., k = 2) UseMethod(AIC)
 
 ## AIC for logLik objects
 AIC.logLik - function(object, ..., k = 2)
 -2 * c(object) + k * attr(object, df)
 
 AIC.default - function(object, ..., k = 2)
 {
 ## AIC for various fitted objects --- any for which there's a logLik()
 method:
 ll - if(stats4 %in% loadedNamespaces()) stats4:::logLik else logLik
 if(length(list(...))) {# several objects: produce data.frame
   object - list(object, ...)
   val - lapply(object, ll)
   val - as.data.frame(t(sapply(val,
 function(el)
 c(attr(el, df), AIC(el, k = k)
   names(val) - c(df, AIC)
 Call - match.call()
 Call$k - NULL
   row.names(val) - as.character(Call[-1])
   val
 } else AIC(ll(object), k = k)
 }
 
 
 
 Steven McKinney
 
 Statistician
 Molecular Oncology and Breast Cancer Program
 British Columbia Cancer Research Centre
 
 email: smckinney +at+ bccrc +dot+ ca
 
 tel: 604-675-8000 x7561
 
 BCCRC
 Molecular Oncology
 675 West 10th Ave, Floor 4
 Vancouver B.C. 
 V5Z 1L3
 Canada
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/AIC-function-and-Step-function-tp20728043p20764062.html
Sent from the R help mailing 

Re: [R] AIC function and Step function

2008-11-30 Thread Kingsford Jones
On Sun, Nov 30, 2008 at 5:05 PM, Dana77 [EMAIL PROTECTED] wrote:

  Thanks for kind help from Steven and Christos last time.  Now I got new
 problem regarding the codes for calculating the weights (w) in AIC ()
 function.
 The original code is as below:
   getAnywhere(logLik.lm)
 function (object, REML = FALSE, ...)
  {
res - object$residuals
p - object$rank
N - length(res)
if (is.null(w - object$weights)) {
w - rep.int(1, N)
}else {
excl - w == 0
if (any(excl)) {
res - res[!excl]
N - length(res)
w - w[!excl]
}
}

  Now my question is, if I use lm() function to fit a multiple linear
 regression model, such as mod.fit-lm(formula = Y~ X1 + X2 + X3, data =
 set1), what code could I use to extract the weights (w) out? or how to
 calculate the weights(w) shown in above codes?


mod.fit won't have weights because you didn't specify any through the
weights argument to lm.  If you had, you could extract them using the
same technique used in the above code:  w - mod.fit$weights

hth,

Kingsford Jones






 Thanks for your time and kind
 help!

 Dana



 Steven McKinney wrote:

 Hi Dana,

 Many thanks to Christos Hatzis who sent
 me an offline response, pointing out the
 new functions that make this much
 easier than my last suggestions:
 methods() and getAnywhere()

 methods(extractAIC)
 [1] extractAIC.aov* extractAIC.coxph*   extractAIC.glm*
 extractAIC.lm*  extractAIC.negbin*
 [6] extractAIC.survreg*

Non-visible functions are asterisked
 getAnywhere(extractAIC.coxph)
 A single object matching 'extractAIC.coxph' was found
 It was found in the following places
   registered S3 method for extractAIC from namespace stats
   namespace:stats
 with value

 function (fit, scale, k = 2, ...)
 {
 edf - length(fit$coef)
 loglik - fit$loglik[length(fit$loglik)]
 c(edf, -2 * loglik + k * edf)
 }
 environment: namespace:stats


 Thank you Christos.


 That said, one of the advantages of getting
 the source code is that it has comments that
 are stripped out when the code is sourced into R

 e.g. from the command line

 getAnywhere(AIC.default)
 A single object matching 'AIC.default' was found
 It was found in the following places
   registered S3 method for AIC from namespace stats
   namespace:stats
 with value

 function (object, ..., k = 2)
 {
 ll - if (stats4 %in% loadedNamespaces())
 stats4:::logLik
 else logLik
 if (length(list(...))) {
 object - list(object, ...)
 val - lapply(object, ll)
 val - as.data.frame(t(sapply(val, function(el) c(attr(el,
 df), AIC(el, k = k)
 names(val) - c(df, AIC)
 Call - match.call()
 Call$k - NULL
 row.names(val) - as.character(Call[-1])
 val
 }
 else AIC(ll(object), k = k)
 }
 environment: namespace:stats

From the source file


 #  File src/library/stats/R/AIC.R
 #  Part of the R package, http://www.R-project.org
 #
 #  This program is free software; you can redistribute it and/or modify
 #  it under the terms of the GNU General Public License as published by
 #  the Free Software Foundation; either version 2 of the License, or
 #  (at your option) any later version.
 #
 #  This program is distributed in the hope that it will be useful,
 #  but WITHOUT ANY WARRANTY; without even the implied warranty of
 #  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 #  GNU General Public License for more details.
 #
 #  A copy of the GNU General Public License is available at
 #  http://www.r-project.org/Licenses/

  Return the object's value of the Akaike Information Criterion
  (or An Inf.. Crit..)

 AIC - function(object, ..., k = 2) UseMethod(AIC)

 ## AIC for logLik objects
 AIC.logLik - function(object, ..., k = 2)
 -2 * c(object) + k * attr(object, df)

 AIC.default - function(object, ..., k = 2)
 {
 ## AIC for various fitted objects --- any for which there's a logLik()
 method:
 ll - if(stats4 %in% loadedNamespaces()) stats4:::logLik else logLik
 if(length(list(...))) {# several objects: produce data.frame
   object - list(object, ...)
   val - lapply(object, ll)
   val - as.data.frame(t(sapply(val,
 function(el)
 c(attr(el, df), AIC(el, k = k)
   names(val) - c(df, AIC)
 Call - match.call()
 Call$k - NULL
   row.names(val) - as.character(Call[-1])
   val
 } else AIC(ll(object), k = k)
 }



 Steven McKinney

 Statistician
 Molecular Oncology and Breast Cancer Program
 British Columbia Cancer Research Centre

 email: smckinney +at+ bccrc +dot+ ca

 tel: 604-675-8000 x7561

 BCCRC
 Molecular Oncology
 675 West 10th Ave, Floor 4
 Vancouver B.C.
 V5Z 1L3
 Canada

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the 

[R] AIC function and Step function

2008-11-27 Thread Dana77

I would like to figure out the equations for calculating AIC in both
 step() function and AIC () function. They are different. Then I  
just type step in the R console, and found the AIC used in step()  
function is extractAIC. I went to the R help, and found:

The criterion used is

AIC = - 2*log L + k * edf,
where L is the likelihood and edf the equivalent degrees of freedom (i.e.,
the number of free parameters for usual parametric models) of fit.

For linear models with unknown scale (i.e., for lm and aov), -2log L is
computed from the deviance and uses a different additive constant to logLik
and hence AIC. If RSS denotes the (weighted) residual sum of squares then
extractAIC uses for - 2log L the formulae RSS/s - n (corresponding to
Mallows' Cp) in the case of known scale s and n log (RSS/n) for unknown
scale. AIC only handles unknown scale and uses the formula n log (RSS/n) - n
+ n log 2π - sum log w where w are the weights.


Now, my question is what code I should use to look at the exact calculation
process in the AIC()function and extractAIC() function in R? Thanks!

Dana
-- 
View this message in context: 
http://www.nabble.com/AIC-function-and-Step-function-tp20728043p20728043.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] AIC function and Step function

2008-11-27 Thread Steven McKinney
, ...)
{
n - length(fit$residuals)
edf - n  - fit$df.residual
RSS - deviance.lm(fit)
dev - if(scale  0) RSS/scale - n else n * log(RSS/n)
c(edf, dev + k * edf)
}
extractAIC.aov - extractAIC.lm

extractAIC.negbin - function(fit, scale, k = 2, ...)
{
n - length(fit$residuals)
edf - n - fit$df.residual
c(edf, -fit$twologlik + k * edf)
}


HTH



Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: [EMAIL PROTECTED] on behalf of Dana77
Sent: Thu 11/27/2008 6:11 PM
To: r-help@r-project.org
Subject: [R]  AIC function and Step function
 

I would like to figure out the equations for calculating AIC in both
 step() function and AIC () function. They are different. Then I  
just type step in the R console, and found the AIC used in step()  
function is extractAIC. I went to the R help, and found:

The criterion used is

AIC = - 2*log L + k * edf,
where L is the likelihood and edf the equivalent degrees of freedom (i.e.,
the number of free parameters for usual parametric models) of fit.

For linear models with unknown scale (i.e., for lm and aov), -2log L is
computed from the deviance and uses a different additive constant to logLik
and hence AIC. If RSS denotes the (weighted) residual sum of squares then
extractAIC uses for - 2log L the formulae RSS/s - n (corresponding to
Mallows' Cp) in the case of known scale s and n log (RSS/n) for unknown
scale. AIC only handles unknown scale and uses the formula n log (RSS/n) - n
+ n log 2? - sum log w where w are the weights.


Now, my question is what code I should use to look at the exact calculation
process in the AIC()function and extractAIC() function in R? Thanks!

Dana
-- 
View this message in context: 
http://www.nabble.com/AIC-function-and-Step-function-tp20728043p20728043.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] AIC function and Step function

2008-11-27 Thread Steven McKinney
Hi Dana,

Many thanks to Christos Hatzis who sent
me an offline response, pointing out the
new functions that make this much
easier than my last suggestions:
methods() and getAnywhere()

 methods(extractAIC)
[1] extractAIC.aov* extractAIC.coxph*   extractAIC.glm* extractAIC.lm*  
extractAIC.negbin* 
[6] extractAIC.survreg*

   Non-visible functions are asterisked
 getAnywhere(extractAIC.coxph)
A single object matching ‘extractAIC.coxph’ was found
It was found in the following places
  registered S3 method for extractAIC from namespace stats
  namespace:stats
with value

function (fit, scale, k = 2, ...) 
{
edf - length(fit$coef)
loglik - fit$loglik[length(fit$loglik)]
c(edf, -2 * loglik + k * edf)
}
environment: namespace:stats
 

Thank you Christos.


That said, one of the advantages of getting
the source code is that it has comments that
are stripped out when the code is sourced into R

e.g. from the command line

 getAnywhere(AIC.default)
A single object matching ‘AIC.default’ was found
It was found in the following places
  registered S3 method for AIC from namespace stats
  namespace:stats
with value

function (object, ..., k = 2) 
{
ll - if (stats4 %in% loadedNamespaces()) 
stats4:::logLik
else logLik
if (length(list(...))) {
object - list(object, ...)
val - lapply(object, ll)
val - as.data.frame(t(sapply(val, function(el) c(attr(el, 
df), AIC(el, k = k)
names(val) - c(df, AIC)
Call - match.call()
Call$k - NULL
row.names(val) - as.character(Call[-1])
val
}
else AIC(ll(object), k = k)
}
environment: namespace:stats

From the source file


#  File src/library/stats/R/AIC.R
#  Part of the R package, http://www.R-project.org
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  A copy of the GNU General Public License is available at
#  http://www.r-project.org/Licenses/

 Return the object's value of the Akaike Information Criterion
 (or An Inf.. Crit..)

AIC - function(object, ..., k = 2) UseMethod(AIC)

## AIC for logLik objects
AIC.logLik - function(object, ..., k = 2)
-2 * c(object) + k * attr(object, df)

AIC.default - function(object, ..., k = 2)
{
## AIC for various fitted objects --- any for which there's a logLik() 
method:
ll - if(stats4 %in% loadedNamespaces()) stats4:::logLik else logLik
if(length(list(...))) {# several objects: produce data.frame
object - list(object, ...)
val - lapply(object, ll)
val - as.data.frame(t(sapply(val,
  function(el)
  c(attr(el, df), AIC(el, k = k)
names(val) - c(df, AIC)
Call - match.call()
Call$k - NULL
row.names(val) - as.character(Call[-1])
val
} else AIC(ll(object), k = k)
}



Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.