Hi, I know asking which test to use is frowned upon on this list... so please do read on for at least a couple of sentences...

I have some multivariate data split as follows

Tumour Site (one of 5 categories) #
Chemo Schedule (one of 3 cats) ##
Cycle (one of 3 cats*) ##
Dose (one of 3 cats*) #

*These are actually integers, but for all our other analyses so far we have grouped them into logical bands of categories.

The dependent variable is "Reaction" or "No Reaction"

I have individually analysed each of the independent variables against Reaction/No Reaction using ChiSq and Fisher tests. Those marked ## produced p values less than 0.05, and those marked # produced p values close to 0.05.
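
For reference, the one-variable-at-a-time tests described above can be sketched like this. The data frame `mydata`, its column names and its values are all made up for illustration (the real data are not shown in this post), so only the shape of the calls carries over.

```r
# Hypothetical data: names and values are invented for this sketch
set.seed(1)
n <- 200
mydata <- data.frame(
  cycle    = factor(sample(c("1-2", "3-4", "5+"), n, replace = TRUE)),
  reaction = factor(sample(c("No Reaction", "Reaction"), n, replace = TRUE))
)

# Cross-tabulate one explanatory variable against the outcome
tab <- table(mydata$cycle, mydata$reaction)

chi <- chisq.test(tab)   # Pearson chi-squared test of independence
fis <- fisher.test(tab)  # Fisher's exact test (also accepts r x c tables)

chi$p.value
fis$p.value
```

Each of these tests looks at one variable in isolation, so it cannot tell whether an apparent effect is really caused by another variable that happens to be unevenly spread across the groups.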

We believe that Cycle is the crucial piece of data - the others just appear to be different because there are more early cycles in certain groups than others.

So I believe what I need to do is a logistic regression on the 4 independent variables. I'm expecting it to show that tumour site, schedule and dose don't matter and only the cycle matters. I've done a lot of reading and I'm still clueless!!

I think I want to do something like:

glm(reaction ~ site + sched + cycle + dose, data=mydata, family=binomial)
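
For a two-level outcome such as Reaction/No Reaction, a logistic regression in R is normally requested with family = binomial. Below is a minimal sketch on simulated data. Everything in `mydata` is invented, and the outcome is deliberately simulated to depend on cycle only, so this shows the mechanics rather than anything about the real study.

```r
# Simulated stand-in for the real data (all names and values are hypothetical)
set.seed(42)
n <- 300
mydata <- data.frame(
  site  = factor(sample(paste0("site", 1:5), n, replace = TRUE)),
  sched = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  cycle = factor(sample(c("early", "mid", "late"), n, replace = TRUE),
                 levels = c("early", "mid", "late")),
  dose  = factor(sample(c("low", "medium", "high"), n, replace = TRUE),
                 levels = c("low", "medium", "high"))
)

# Assumption for the demo: reactions are more likely in early cycles only
p <- ifelse(mydata$cycle == "early", 0.4, 0.15)
mydata$reaction <- rbinom(n, size = 1, prob = p)  # 1 = Reaction, 0 = No Reaction

fit <- glm(reaction ~ site + sched + cycle + dose,
           data = mydata, family = binomial)

summary(fit)                # Wald test per coefficient, each vs. its reference level
drop1(fit, test = "Chisq")  # one likelihood-ratio test per variable, adjusted for the rest
exp(coef(fit))              # odds ratios relative to each reference level
```

The drop1() table is probably the closest thing to a "which variable matters" summary: each row tests one whole variable after allowing for the other three, which is the adjustment the separate ChiSq/Fisher tests cannot make.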


I am then expecting to see some very long output with lots of numbers... my question is two-fold:

1. Is glm the right thing to use, before I waste my time?

and 2. How do I interpret the result? (I'm kind of expecting a lecture here, as I'm really looking for a nice snappy 'p<0.05 means this variable is the one having the influence' type of answer and I suspect I'm going to be told that's not possible...!)

To be clear the example given in the docs is:

 library(MASS)
 data(anorexia)
 anorex.1 <- glm(Postwt ~ Prewt + Treat + offset(Prewt), family = gaussian,
                 data = anorexia)


The output of anorex.1 is:

Call:  glm(formula = Postwt ~ Prewt + Treat + offset(Prewt), family = gaussian,
     data = anorexia)

Coefficients:
(Intercept)        Prewt    TreatCont      TreatFT
    49.7711      -0.5655      -4.0971       4.5631

Degrees of Freedom: 71 Total (i.e. Null);  68 Residual
Null Deviance:        4525
Residual Deviance: 3311     AIC: 490



and the output of summary(anorex.1) is:

Call:
glm(formula = Postwt ~ Prewt + Treat + offset(Prewt), family = gaussian,
    data = anorexia)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-14.1083   -4.2773   -0.5484    5.4838   15.2922

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  49.7711    13.3910   3.717 0.000410 ***
Prewt        -0.5655     0.1612  -3.509 0.000803 ***
TreatCont    -4.0971     1.8935  -2.164 0.033999 *
TreatFT       4.5631     2.1333   2.139 0.036035 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 48.69504)

    Null deviance: 4525.4  on 71  degrees of freedom
Residual deviance: 3311.3  on 68  degrees of freedom
AIC: 489.97

Number of Fisher Scoring iterations: 2



---
Could someone either point me to a decent place that explains what this means, or give me some pointers? i.e. which of the variables has the influence on the outcome in the anorexia data?

Please don't shout!! I'm happy to be pointed to a reference but would prefer one in plain English, not some stats mumbo jumbo!
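
For reference, here is a rough sketch of one way to look at the anorexia fit variable by variable rather than coefficient by coefficient. It reuses the model from the docs example and only calls functions from base R and MASS; choosing drop1() here is just one reasonable option, not the only one.

```r
library(MASS)  # provides the anorexia data set

anorex.1 <- glm(Postwt ~ Prewt + Treat + offset(Prewt),
                family = gaussian, data = anorexia)

# Treat has three levels; the first one is the reference level, so the
# TreatCont and TreatFT rows in summary() are comparisons against it
levels(anorexia$Treat)

drop1(anorex.1, test = "F")  # one F test per variable (Prewt, Treat)
confint.default(anorex.1)    # Wald confidence intervals for each coefficient
```

In the summary() table each row compares one coefficient with zero, so TreatCont and TreatFT are two separate comparisons against the reference treatment. The drop1() table instead asks whether the model gets noticeably worse when a whole variable is removed.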

Calum
