Re: [R] How to apply calculations in "formula" to "data frame"

2017-03-28 Thread Sören Vogel
`model.matrix` was what I was looking for.

Thanks,
Sören
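
For the archive, here is a minimal sketch of the two approaches suggested in the reply quoted below, assuming the example data from the original post. The second formula, `For2`, is a made-up illustration of an "elementary expression" and is not from the thread.

```r
set.seed(1)
Data <- data.frame(v1 = rnorm(31), v2 = runif(31),
                   v3 = sample(1:7, 31, replace = TRUE), v4 = rlnorm(31))
For1 <- v1 ~ .^3

# model.matrix() expands "." against the columns of Data (excluding the
# response) and returns the design matrix, including interaction columns.
Rhs <- model.matrix(For1, data = Data)
colnames(Rhs)

# The response can be extracted from the model frame in the same way.
Lhs <- model.response(model.frame(For1, data = Data))

# For a plain arithmetic right-hand side (hypothetical example), eval()
# on the third element of the formula is enough.
For2 <- v1 ~ v2 * v4
Prod <- eval(For2[[3]], Data, environment(For2))
```

Note that `model.matrix` reads `v2 * v4` as "main effects plus interaction", whereas `eval` treats it as ordinary multiplication, so which of the two is appropriate depends on how the formula is meant to be interpreted.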

> On 28.03.2017, at 16:57, peter dalgaard <pda...@gmail.com> wrote:
> 
> 
>> On 28 Mar 2017, at 16:14 , Sören Vogel <soeren.vo...@posteo.ch> wrote:
>> 
>> Hello
>> 
>> How can I apply a formula to a data frame?
> 
> 
> That would depend on whether the formula has any special interpretation.
> 
> If it is just an elementary expression, then it would be like
> 
> eval(For1[[3]], Data, environment(For1))
> 
> but you are using "." to represent... what exactly?
> 
> One possibility is model.matrix(For1, Data)
> 
> but I'm not at all sure that that is what you want.
> 
> -pd
> 
>> 
>> library("formula.tools")
>> Data <- data.frame("v1" = rnorm(31), "v2" = runif(31),
>>                    "v3" = sample(1:7, 31, replace = TRUE), "v4" = rlnorm(31))
>> For1 <- as.formula(v1 ~ .^3)
>> Lhs <- Data[, formula.tools::lhs.vars(For1)]
>> Rhs <- apply_formula_to_data_frame_and_return_result(Data, For1) # ???
>> 
>> Thank you,
>> Sören
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com


[R] How to apply calculations in "formula" to "data frame"

2017-03-28 Thread Sören Vogel
Hello

How can I apply a formula to a data frame?

library("formula.tools")
Data <- data.frame("v1" = rnorm(31), "v2" = runif(31),
                   "v3" = sample(1:7, 31, replace = TRUE), "v4" = rlnorm(31))
For1 <- as.formula(v1 ~ .^3)
Lhs <- Data[, formula.tools::lhs.vars(For1)]
Rhs <- apply_formula_to_data_frame_and_return_result(Data, For1) # ???

Thank you,
Sören


Re: [R] Different LLRs on multinomial logit models in R and SPSS

2011-01-09 Thread Sören Vogel
Hello, thanks for all your replies. It was a helpful lesson for me
(and hopefully for my colleagues, too). Best, Sören


On 11-01-07 11:23, David Winsemius wrote:


Date: Fri, 7 Jan 2011 11:23:04 -0500
From: David Winsemius dwinsem...@comcast.net
To: sovo0...@gmail.com
Cc: r-help@r-project.org
Subject: Re: [R] Different LLRs on multinomial logit models in R and SPSS


On Jan 7, 2011, at 8:26 AM, sovo0...@gmail.com wrote:


On Thu, 6 Jan 2011, David Winsemius wrote:


On Jan 6, 2011, at 11:23 AM, Sören Vogel wrote:


Thanks for your replies. I am no mathematician or statistician by far,
however, it appears to me that the actual value of any of the two LLs
is indeed important when it comes to calculation of
Pseudo-R-Squared-s. If Rnagel divides by (some transformation of) the
actual value of llnull then any calculation of Rnagel should differ.
How come? Or is my function wrong? And if my function is right, how
can I calculate an R-Squared independent from the software used?


You have two models in that function, the null one with . ~ 1 and the 
original one, and you are getting a ratio on the likelihood scale (which is 
a difference on the log-likelihood or deviance scale).


If this is the case, calculating 'fit' indices for those models must end up 
in different fit indices depending on software:


n <- 143
ll1 <- 135.02
ll2 <- 129.8
# Rcs
(Rcs <- 1 - exp( (ll2 - ll1) / n ))
# Rnagel
Rcs / (1 - exp(-ll1/n))
ll3 <- 204.2904
ll4 <- 199.0659
# Rcs
(Rcs <- 1 - exp( (ll4 - ll3) / n ))
# Rnagel
Rcs / (1 - exp(-ll3/n))

The Rcs values are equal; the Rnagel values, however, are not. Of course, this 
is not a question, but I am rather confused. When publishing results I am 
required to use fit indices, and editors would complain that they differ.


It is well known that editors are sometimes confused about statistics, and if 
an editor is insistent on publishing indices that are in fact arbitrary then 
that is a problem. I would hope that the editor were open to education. (And 
often there is a statistical associate editor who will be more likely to have 
a solid grounding and to whom one can appeal in situations of initial 
obstinacy.)  Perhaps you will be doing the world of science a favor by 
suggesting that said editor first review a first-year calculus text regarding 
the fact that indefinite integrals are only calculated up to an arbitrary 
constant and that one can only use the results in a practical setting by 
specifying the limits of integration. So it is with likelihoods. They are 
only meaningful when comparing two nested models. Sometimes the software 
obscures this fact, but it remains a statistical _fact_.


Whether your code is correct (and whether the Nagelkerke R^2 remains 
invariant with respect to such transformations) I cannot say. (I suspect that 
it would be, but I have never liked the NagelR2 as a measure, and didn't 
really like R^2 as a measure in linear regression for that matter, either.) I 
focus on fitting functions to trends, examining predictions, and assessing 
confidence intervals for parameter estimates. The notion that model fit is 
well-summarized in a single number blinds one to other critical issues such 
as the linearity and monotonicity assumptions implicit in much of regression 
(mal-)practice.


So, if someone who is more enamored of (or even more knowledgeably scornful 
of)  the Nagelkerke R^2 measure wants to take over here, I will read what 
they say with interest and appreciation.
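
To make the point about arbitrary constants concrete, here is a minimal sketch using the four figures quoted above, treating them as deviances (-2 log-likelihood) as the calculation in the thread does. The helper name `pseudo_r2` is hypothetical and not part of any package.

```r
# Cox-Snell and Nagelkerke pseudo-R^2 computed from deviances.
# Rcs uses only the difference of the two deviances; Rnagel also
# rescales by the absolute value of the null deviance.
pseudo_r2 <- function(dev_null, dev_mod, n) {
  rcs <- 1 - exp((dev_mod - dev_null) / n)
  rnagel <- rcs / (1 - exp(-dev_null / n))
  c(Rcs = rcs, Rnagel = rnagel)
}

n <- 143
spss <- pseudo_r2(dev_null = 135.02,   dev_mod = 129.80,   n = n)
r    <- pseudo_r2(dev_null = 204.2904, dev_mod = 199.0659, n = n)
round(rbind(SPSS = spss, R = r), 4)
```

Both sets of figures have almost the same deviance difference, so the two Rcs values nearly coincide. The Rnagel values do not, because the denominator depends on the null deviance itself, and that includes whatever additive constant each program keeps or drops.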




Sören


David Winsemius, MD
West Hartford, CT



--
Sören Vogel, sovo0...@gmail.com, http://sovo0815.wordpress.com/



[R] Different LLRs on multinomial logit models in R and SPSS

2011-01-06 Thread Sören Vogel
Hello, after calculating a multinomial logit regression on my data, I
compared the output to an output retrieved with SPSS 18 (Mac). The
coefficients appear to be the same, but the logLik (and therefore fit)
values differ widely. Why?

The regression in R:

set.seed(1234)
df <- data.frame(
  y=factor(sample(LETTERS[1:3], 143, replace=TRUE, prob=c(4, 1, 10))),
  a=sample(1:5, 143, replace=TRUE),
  b=sample(1:7, 143, replace=TRUE),
  c=sample(1:2, 143, replace=TRUE)
)
library(nnet)
mod1 <- multinom(y ~ ., data=df, trace=FALSE)
deviance(mod1) # 199.0659
mod0 <- update(mod1, . ~ 1, trace=FALSE)
deviance(mod0) # 204.2904

Output data and syntax for SPSS:

df2 <- df
df2[, 1] <- as.numeric(df[, 1])
write.csv(df2, file="dfxy.csv", row.names=FALSE, na="")
syntaxfile <- "dfxy.sps"
cat('GET DATA
  /TYPE=TXT
  /FILE=\'', getwd(), '/dfxy.csv\'
  /DELCASE=LINE
  /DELIMITERS=","
  /QUALIFIER=\'"\'
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2
  /IMPORTCASE=ALL
  /VARIABLES=
  y F1.0
  a F8.4
  b F8.4
  c F8.4.
CACHE.
EXECUTE.
DATASET NAME DataSet1 WINDOW=FRONT.

VALUE LABELS
  /y 1 "A" 2 "B" 3 "C".
EXECUTE.

NOMREG y (BASE=1 ORDER=ASCENDING) WITH a b c
  /CRITERIA CIN(95) DELTA(0) MXITER(100) MXSTEP(5) CHKSEP(20)
LCONVERGE(0) PCONVERGE(0.01)
SINGULAR(0.0001)
  /MODEL
  /STEPWISE=PIN(.05) POUT(0.1) MINEFFECT(0) RULE(SINGLE)
ENTRYMETHOD(LR) REMOVALMETHOD(LR)
  /INTERCEPT=INCLUDE
  /PRINT=FIT PARAMETER SUMMARY LRT CPS STEP MFI IC.
', file=syntaxfile, sep="", append=FALSE)

- Loglik0: 135.02
- Loglik1: 129.80

Thanks, Sören



Re: [R] Different LLRs on multinomial logit models in R and SPSS

2011-01-06 Thread Sören Vogel
Thanks for your replies. I am no mathematician or statistician by far,
however, it appears to me that the actual value of any of the two LLs
is indeed important when it comes to calculation of
Pseudo-R-Squared-s. If Rnagel divides by (some transformation of) the
actual value of llnull then any calculation of Rnagel should differ.
How come? Or is my function wrong? And if my function is right, how
can I calculate an R-Squared independent from the software used?

Rfits <- function(mod) {
  llnull <- deviance(update(mod, . ~ 1, trace=FALSE))
  llmod <- deviance(mod)
  n <- length(predict(mod))
  Rcs <- 1 - exp( (llmod - llnull) / n )
  Rnagel <- Rcs / (1 - exp(-llnull/n))
  out <- list(
    Rcs=Rcs,
    Rnagel=Rnagel
  )
  class(out) <- c("list", "table")
  return(out)
}
