Re: [R] Linear Logistic Regression - Understanding the output (and possibly the test to use!)

2010-09-05 Thread stats

David Winsemius wrote:


1. is glm the right thing to use before I waste my time


Yes, but if your outcome variable is binomial then the family argument
should be binomial. (And if you thought it should be poisson,
then why did you use gaussian below?)

Used gaussian below because it was the example from the docs.  That's not
my data; it's example data, which was not binomial.




and 2. how do I interpret the result!


Result? What result? I don't see any description of your data, nor any code.

I didn't provide MY DATA because I thought that would complicate things 
even further.  So I was hoping for some advice on how to interpret the 
result of the example data so that I could then apply that to my data.   
I haven't even tried to run my data as I couldn't see what the output of 
the examples was trying to tell me.


However, as you've snipped it because it was not relevant, that's useful
to know.  I often find this problem with the examples in the R docs:
they suddenly take a dataset that I have no knowledge of, play with
it and produce an 'answer'.  The examples are presumably provided to
enable me to work through how the code works etc.  So what I was hoping
for was someone to point to somewhere on-line that documents how to use
the function for logistic regression and explains what all that table
of data it spits out actually means.  Someone has VERY KINDLY posted me
something off list which I believe helps.


I think you need to consult a statistician or someone who has taken 
the time to read that statistical mumbo jumbo you don't want to 
learn. This mailing list is not set up to be a tutorial site.

I have access to stats advice, but (a) I don't want to turn up to them
with a pile of paper from R and have them say glm() may be the wrong
analysis, (b) they don't do R so they can't tell me if I've used R
wrongly, and (c) I completely expect they'd ask which of the values in
the table matter, since no paper I've ever seen published showed a
logistic regression with a table of numbers.


I have a couple of Kleinbaum's (et al) other texts and find them to be 
well written and reasoned, so I suspect the citation above would be as 
accessible as any.


Thank you, that is useful.  There is a real problem when buying R text 
books.  None of the bookshops round here stock any which means you can't 
tell if they are much good.  I've looked at some and they seem to be 
re-writes of the help files.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Linear Logistic Regression - Understanding the output (and possibly the test to use!)

2010-09-05 Thread David Winsemius


On Sep 5, 2010, at 6:06 AM, st...@wittongilbert.free-online.co.uk wrote:


David Winsemius wrote:


1. is glm the right thing to use before I waste my time


Yes, but if your outcome variable is binomial then the family
argument should be binomial. (And if you thought it should
be poisson, then why did you use gaussian below?)

Used gaussian below because it was the example from the docs.  That's
not my data; it's example data, which was not binomial.




and 2. how do I interpret the result!


Result? What result? I don't see any description of your data, nor any
code.

I didn't provide MY DATA because I thought that would complicate  
things even further.  So I was hoping for some advice on how to  
interpret the result of the example data so that I could then apply  
that to my data.   I haven't even tried to run my data as I couldn't  
see what the output of the examples was trying to tell me.


I didn't think that providing commentary on ols regression results was  
going to be that germane to setting up and running logistic  
regression. Why haven't you tried a Google search for tutorials? When
I did that I found:


http://www.ats.ucla.edu/stat/r/dae/logit.htm

Surely there are others.

--
David Winsemius, MD
West Hartford, CT



Re: [R] Linear Logistic Regression - Understanding the output (and possibly the test to use!)

2010-09-05 Thread Peng, C


Calum-4 wrote:
 
 Hi, I know asking which test to use is frowned upon on this list... so 
 please do read on for at least a couple of sentences...
 
 I have some multivariate data split as follows
 
 Tumour Site (one of 5 categories) #
 Chemo Schedule (one of 3 cats) ##
 Cycle (one of 3 cats*) ##
 Dose (one of 3 cats*) #
 
 *These are actually integers but for all our other analysis so far we 
 have grouped them into logical bands of categories.
 
 The dependent variable is Reaction or No Reaction
 
 I have individually analysed each of the independent variables against 
 Reaction/No Reaction using ChiSq and Fisher Tests. Those marked ## 
 produced p values less than 0.05, and those marked # produced p values 
 close to 0.05.
 
 We believe that Cycle is the crucial piece of data - the others just 
 appear to be different because there are more early cycles in certain 
 groups than others.
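
The situation described above (a variable that looks associated with the outcome only because it is unevenly spread across cycles) can be illustrated with a small simulation. This is only a sketch: the data are invented, the two-level `cycle` and `dose` factors and all probabilities are assumptions chosen for illustration, and none of it comes from the poster's data.

```r
# Invented data: reaction risk depends on cycle only, but the
# high-dose group is (by assumption) dominated by early cycles.
set.seed(3)
n <- 600
cycle <- factor(sample(c("early", "late"), n, replace = TRUE))
p_high <- ifelse(cycle == "early", 0.7, 0.3)
dose <- factor(ifelse(rbinom(n, 1, p_high) == 1, "high", "low"))
reaction <- rbinom(n, 1, ifelse(cycle == "early", 0.4, 0.1))

# Unadjusted test: dose against reaction, ignoring cycle.
unadj <- chisq.test(table(dose, reaction))
unadj

# Adjusted tests: each term is tested with the other kept in the model.
fit <- glm(reaction ~ cycle + dose, family = binomial)
adj <- drop1(fit, test = "Chisq")
adj
```

Comparing the unadjusted p-value for dose with its row in the `drop1()` table shows how much of the apparent dose effect remains once cycle is accounted for.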
 
 SO - I believe what I need to do is a Linear Logistic Regression on the 
 4 independent variables. And I'm expecting it to show that the tumour 
 site, schedule and dose don't matter, only the cycle matters. Done a lot 
 of reading and I'm clueless!!
 
 I think I want to do something like:
 
 glm (reaction ~ site + sched + cycle + dose, data=mydata, family=poisson)
 =
 Comment 1: If you stick to Linear Logistic Regression, the family should
 be binomial, assuming that reaction has only two values (Yes/No).
 family=poisson should be used when the response is a frequency count
 such as the number of tumors.
 =
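
A minimal sketch of the call with the family changed to binomial follows. The variable names (`reaction`, `site`, `sched`, `cycle`, `dose`, `mydata`) are taken from the post, but the data frame itself is simulated here, and the level labels and probabilities are assumptions made purely so the example runs.

```r
# Simulated stand-in for the poster's data; every value here is invented.
set.seed(1)
n <- 300
mydata <- data.frame(
  site  = factor(sample(paste0("site", 1:5), n, replace = TRUE)),
  sched = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  cycle = factor(sample(c("early", "mid", "late"), n, replace = TRUE),
                 levels = c("early", "mid", "late")),
  dose  = factor(sample(c("low", "medium", "high"), n, replace = TRUE),
                 levels = c("low", "medium", "high"))
)

# Assumed truth for the simulation: only cycle changes the reaction risk.
p_react <- c(early = 0.40, mid = 0.20, late = 0.10)[as.character(mydata$cycle)]
mydata$reaction <- rbinom(n, size = 1, prob = p_react)

# Two-level outcome, so family = binomial (the logit link is the default).
fit <- glm(reaction ~ site + sched + cycle + dose,
           data = mydata, family = binomial)
summary(fit)
```

Each factor contributes one coefficient per non-reference level, so the summary table has one row for the intercept plus four for site and two each for sched, cycle and dose.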
 
 I am then expecting to see some very long output with lots of numbers... 
 ...my question is TWO fold -
 
 1. is glm the right thing to use before I waste my time
 
 and 2. how do I interpret the result! (I'm kind of expecting a lecture here 
 as I'm really looking for a nice snappy 'p<0.05 means this variable is 
 the one having the influence' type answer and I suspect I'm going to be 
 told that's not possible...!)
 
 Comment 2: The regression coefficients in a binary logistic regression
 model are log odds ratios. Interpreting an odds ratio can be
 tricky, but the p-value is interpreted in the usual way.
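
As a rough illustration of Comment 2, the sketch below fits a logistic model with a single two-level predictor and exponentiates the coefficients to get odds ratios. The data are simulated and the probabilities are assumptions; with one binary predictor the fitted odds ratio should agree with the one computed by hand from the 2x2 table, which is a convenient sanity check.

```r
# Invented data: one two-level predictor, binary outcome.
set.seed(2)
cycle <- factor(sample(c("early", "late"), 400, replace = TRUE))
reaction <- rbinom(400, 1, ifelse(cycle == "early", 0.4, 0.15))
fit <- glm(reaction ~ cycle, family = binomial)

# Coefficients are on the log-odds scale; exponentiate for odds ratios.
odds_ratio <- exp(coef(fit))
ci <- exp(confint.default(fit))     # Wald 95% intervals on the same scale
cbind(odds_ratio, ci)

# Wald p-values, read in the usual way.
coef(summary(fit))[, "Pr(>|z|)"]

# The same odds ratio (late vs early) computed directly from the table.
tab <- table(cycle, reaction)
or_tab <- (tab["late", "1"] / tab["late", "0"]) /
          (tab["early", "1"] / tab["early", "0"])
or_tab
```

Note that the exponentiated intercept is the odds of a reaction in the reference group ("early" here), not an odds ratio; only the `cyclelate` row compares two groups.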
 
 To be clear the example given in the docs is:
 
  library(MASS)
 
  data(anorexia)
 
  anorex.1 <- glm(Postwt ~ Prewt + Treat + offset(Prewt), family =
 gaussian, data = anorexia)
 
 ===
 Comment 3: Here Postwt is a continuous variable. The specification family
 = gaussian assumes that Postwt is normally distributed; therefore, the
 fitted model is the ordinary normal linear regression model.
 ===
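
The point in Comment 3 can be checked directly: fitting the documentation example with `lm()` instead of `glm(..., family = gaussian)` should give the same coefficients. This is a small sketch using the same formula and the `anorexia` data from the MASS package quoted in the post.

```r
library(MASS)   # provides the anorexia data used in the thread

# The documentation example, fitted as a gaussian GLM...
fit_glm <- glm(Postwt ~ Prewt + Treat + offset(Prewt),
               family = gaussian, data = anorexia)

# ...and the same model fitted as an ordinary linear regression.
fit_lm <- lm(Postwt ~ Prewt + Treat + offset(Prewt), data = anorexia)

# Compare the two sets of estimates.
all.equal(coef(fit_glm), coef(fit_lm))
```

This is why the anorexia output shows t values rather than the z values a logistic fit would report.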
 
 The output of anorex.1 is:
 
 Call:  glm(formula = Postwt ~ Prewt + Treat + offset(Prewt), family = gaussian,
     data = anorexia)

 Coefficients:
 (Intercept)        Prewt    TreatCont      TreatFT
     49.7711      -0.5655      -4.0971       4.5631

 Degrees of Freedom: 71 Total (i.e. Null);  68 Residual
 Null Deviance:      4525
 Residual Deviance: 3311     AIC: 490
 
 
 
 and the output of summary(anorex.1) is:
 
 Call:
 glm(formula = Postwt ~ Prewt + Treat + offset(Prewt), family = gaussian,
     data = anorexia)

 Deviance Residuals:
      Min        1Q    Median        3Q       Max
 -14.1083   -4.2773   -0.5484    5.4838   15.2922

 Coefficients:
             Estimate Std. Error t value Pr(>|t|)
 (Intercept)  49.7711    13.3910   3.717 0.000410 ***
 Prewt        -0.5655     0.1612  -3.509 0.000803 ***
 TreatCont    -4.0971     1.8935  -2.164 0.033999 *
 TreatFT       4.5631     2.1333   2.139 0.036035 *
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 (Dispersion parameter for gaussian family taken to be 48.69504)

     Null deviance: 4525.4  on 71  degrees of freedom
 Residual deviance: 3311.3  on 68  degrees of freedom
 AIC: 489.97

 Number of Fisher Scoring iterations: 2
 
 
 
 ---
 Can someone either point me to a decent place that would explain what 
 this means, or provide me some pointers? i.e. which of the variables has 
 the influence on the outcome in the anorexia data?
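
One way to get a single test per variable, rather than one row per factor level as in the summary table, is `drop1()`, which refits the model with each term removed in turn. The sketch below applies it to the anorexia example from the post; choosing the F test for the gaussian fit is an assumption about what is appropriate here, not something stated in the thread.

```r
library(MASS)   # provides the anorexia data used in the thread

anorex.1 <- glm(Postwt ~ Prewt + Treat + offset(Prewt),
                family = gaussian, data = anorexia)

# One row per model term: Prewt, and Treat as a whole
# (both non-reference treatment levels tested together).
term_tests <- drop1(anorex.1, test = "F")
term_tests
```

For a logistic fit the same idea would use `test = "Chisq"` instead of the F test.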
 
 Please don't shout!! Happy to be pointed to a reference but would prefer 
 one in common English, not some stats mumbo jumbo!
 
 Calum
 
 
 


Re: [R] Linear Logistic Regression - Understanding the output (and possibly the test to use!)

2010-09-05 Thread Frank Harrell

On Sun, 5 Sep 2010, st...@wittongilbert.free-online.co.uk wrote:


David Winsemius wrote:


1. is glm the right thing to use before I waste my time


Yes, but if your outcome variable is binomial then the family argument
should be binomial. (And if you thought it should be poisson,
then why did you use gaussian below?)

Used gaussian below because it was the example from the docs.  That's not
my data; it's example data, which was not binomial.



and 2. how do I interpret the result!


Result? What result? I don't see any description of your data, nor any code.

I didn't provide MY DATA because I thought that would complicate things
even further.  So I was hoping for some advice on how to interpret the
result of the example data so that I could then apply that to my data.
I haven't even tried to run my data as I couldn't see what the output of
the examples was trying to tell me.

However, as you've snipped it because it was not relevant, that's useful
to know.  I often find this problem with the examples in the R docs:
they suddenly take a dataset that I have no knowledge of, play with
it and produce an 'answer'.  The examples are presumably provided to
enable me to work through how the code works etc.  So what I was hoping
for was someone to point to somewhere on-line that documents how to use
the function for logistic regression and explains what all that table
of data it spits out actually means.  Someone has VERY KINDLY posted me
something off list which I believe helps.


I think you need to consult a statistician or someone who has taken
the time to read that statistical mumbo jumbo you don't want to
learn. This mailing list is not set up to be a tutorial site.

I have access to stats advice, but (a) I don't want to turn up to them
with a pile of paper from R and have them say glm() may be the wrong
analysis, (b) they don't do R so they can't tell me if I've used R
wrongly, and (c) I completely expect they'd ask which of the values in
the table matter, since no paper I've ever seen published showed a
logistic regression with a table of numbers.


Clearly the time to consult a statistician is before you have done any 
statistical analysis.


Frank Harrell



I have a couple of Kleinbaum's (et al) other texts and find them to be
well written and reasoned, so I suspect the citation above would be as
accessible as any.


Thank you, that is useful.  There is a real problem when buying R text
books.  None of the bookshops round here stock any which means you can't
tell if they are much good.  I've looked at some and they seem to be
re-writes of the help files.




Re: [R] Linear Logistic Regression - Understanding the output (and possibly the test to use!)

2010-09-04 Thread David Winsemius


On Sep 4, 2010, at 6:53 PM, st...@wittongilbert.free-online.co.uk wrote:

Hi, I know asking which test to use is frowned upon on this list...
so please do read on for at least a couple of sentences...


I have some multivariate data split as follows

Tumour Site (one of 5 categories) #
Chemo Schedule (one of 3 cats) ##
Cycle (one of 3 cats*) ##
Dose (one of 3 cats*) #

*These are actually integers but for all our other analysis so far  
we have grouped them into logical bands of categories.


The dependent variable is Reaction or No Reaction

I have individually analysed each of the independent variables
against Reaction/No Reaction using ChiSq and Fisher Tests. Those
marked ## produced p values less than 0.05, and those marked #
produced p values close to 0.05.


We believe that Cycle is the crucial piece of data - the others just  
appear to be different because there are more early cycles in  
certain groups than others.


SO - I believe what I need to do is a Linear Logistic Regression on
the 4 independent variables. And I'm expecting it to show that the
tumour site, schedule and dose don't matter, only the cycle matters.
Done a lot of reading and I'm clueless!!


I think I want to do something like:

glm (reaction ~ site + sched + cycle + dose, data=mydata,  
family=poisson)



I am then expecting to see some very long output with lots of  
numbers... ...my question is TWO fold -


1. is glm the right thing to use before I waste my time


Yes, but if your outcome variable is binomial then the family argument
should be binomial. (And if you thought it should be poisson,
then why did you use gaussian below?)


and 2. how do I interpret the result!


Result? What result? I don't see any description of your data, nor any
code.


(I'm kind of expecting a lecture here as I'm really looking for a nice
snappy 'p<0.05 means this variable is the one having the influence'
type answer and I suspect I'm going to be told that's not possible...!)


I think you need to consult a statistician or someone who has taken  
the time to read that statistical mumbo jumbo you don't want to  
learn. This mailing list is not set up to be a tutorial site.


(Re your request below: Some years ago I saw one of those programmed  
learning texts by Kleinbaum on logistic regression. Maybe you could  
read it and see if it makes your consulting sessions go more smoothly.)


http://www.bookfinder.com/search/?author=kleinbaum&title=logistic+regression&lang=en&isbn=&submit=Begin+search&new_used=*&destination=us&currency=USD&mode=basic&st=sr&ac=qr

I have a couple of Kleinbaum's (et al) other texts and find them to be  
well written and reasoned, so I suspect the citation above would be as  
accessible as any.




To be clear the example given in the docs is:


library(MASS)


[snipped an example that was not relevant to logistic regression]


---
Can someone either point me to a decent place that would explain
what this means, or provide me some pointers? i.e. which of the
variables has the influence on the outcome in the anorexia data?


Please don't shout!! Happy to be pointed to a reference but would
prefer one in common English, not some stats mumbo jumbo!


Calum


--

David Winsemius, MD
West Hartford, CT
