[R] Regression using age and Duration of disease as continuous factors

2009-07-20 Thread 1Rnwb

Please explain me as what it means and how this analysis can be done using R
and which library(ies) are needed.
Thanks

-- 
View this message in context: 
http://www.nabble.com/Regression-using-age-and-Duration-of-disease-as-a-continous-factors-tp24574133p24574133.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regression using age and Duration of disease as continuous factors

2009-07-20 Thread Greg Snow
If you need an explanation of what regression means, then you need to take a 
course or 2 at your local university, or at least hire a statistical consultant.

If you understand regression and just need the explanation of how to do it 
using R, then read section 11 (as well as everything else) of "An Introduction 
to R".

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111




Re: [R] Regression using age and Duration of disease as continuous factors

2009-07-20 Thread Rolf Turner


On 21/07/2009, at 5:30 AM, 1Rnwb wrote:

> Please explain me as what it means and how this analysis can be
> done using R
> and which library(ies) are needed.
> Thanks

Go stick your head in a pig! (***)

cheers,

Rolf Turner

(***) Motto of the Sirius Cybernetics Corporation --- see ``The
Hitchhiker's Guide to the Galaxy''.



Re: [R] Regression using age and Duration of disease as continuous factors

2009-07-20 Thread 1Rnwb

I have read that multiple times without understanding anything. 

Greg Snow-2 wrote:
> 
> If you need an explanation of what regression means, then you need to take
> a course or 2 at your local university, or at least hire a statistical
> consultant.
> 
> If you understand regression and just need the explanation of how to do it
> using R, then read section 11 (as well as everything else) of "An
> Introduction to R".


Re: [R] Regression using age and Duration of disease as continuous factors

2009-07-20 Thread 1Rnwb

I thought this forum is for help. now i know what the statistician in my dept
does all day long

Rolf Turner-3 wrote:
> 
> 
> On 21/07/2009, at 5:30 AM, 1Rnwb wrote:
> 
>>
>> Please explain me as what it means and how this analysis can be  
>> done using R
>> and which library(ies) are needed.
>> Thanks
> 
> Go stick your head in a pig! (***)


Re: [R] Regression using age and Duration of disease as continuous factors

2009-07-20 Thread Steve Lianoglou


On Jul 20, 2009, at 5:30 PM, 1Rnwb wrote:

> I have read that multiple times without understanding anything.

If that's the case, then perhaps you should follow Greg's first piece
of advice:

> Greg Snow-2 wrote:
>> If you need an explanation of what regression means, then you need
>> to take a course or 2 at your local university, or at least hire a
>> statistical consultant.

We're not trying to be rude, but your question is quite ill formed,
and no one can really help you:

> Please explain me as what it means and how this analysis can be done
> using R and which library(ies) are needed.

It's not clear what you do/don't understand, and your problem
statement is too vague for anyone to tell you more.

It seems like you're saying you don't understand what "regression" is,
in which case a simple email will not help you.

"Simply put" regression is a method to predict a (typically)
"continuous" output by some combination of inputs, eg. predicting
someone's height by knowing their weight and shoe size (these are
continuous variables, too). It looks like in your case, your "inputs"
are the "continuous factors" of your email subject, which are age and
duration of disease?

You haven't even mentioned what it is you are trying to predict.
Survival?

The thing is, as soon as one puts something in "simple terms," it's
often wrong -- which is why Greg suggested taking a class or hiring
someone to help you.

Anyway, I'm assuming you must know what regression is, otherwise you
wouldn't be looking to know how to do it. One way to perform linear
regression in R is using the "lm" function. Type ?lm at the R prompt
for help.
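
A minimal sketch of what such an lm call can look like, using the
height example from above. The data are simulated and every variable
name and number is invented for illustration; nothing here comes from
the poster's dataset.

```r
# Simulated data: all names, units, and effect sizes are made up.
set.seed(1)
n <- 100
weight    <- rnorm(n, mean = 70, sd = 10)   # kg
shoe_size <- rnorm(n, mean = 42, sd = 2)    # EU sizes
height    <- 100 + 0.5 * weight + 0.8 * shoe_size + rnorm(n, sd = 3)  # cm
people <- data.frame(height, weight, shoe_size)

# Linear regression: height is the output, the other two are the inputs.
fit <- lm(height ~ weight + shoe_size, data = people)
summary(fit)   # coefficient estimates, standard errors, R-squared

# Predicted height for a new, hypothetical person.
predict(fit, newdata = data.frame(weight = 80, shoe_size = 44))
```

The formula on the left of data= names the output before the tilde and
the inputs after it; summary() is usually the first thing to look at.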



> I thought this forum is for help. now i know what the statistician
> in my dept does all day long

It is for help -- you'll see it's quite active around here.

It's *not* for soliciting other people to do your analysis for you,
which is how your email comes across. All of us have our own work to
do, but are here to help if you're stuck on something *in
particular* ... perhaps you can do a bit more legwork and rephrase
your question in a more meaningful way.


-steve

--
Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University

Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Regression using age and Duration of disease as continuous factors

2009-07-21 Thread John Kane



--- On Mon, 7/20/09, 1Rnwb  wrote:

> I thought this forum is for help. now i know what the
> statistician in my dept
> does all day long

Clearly he's not talking to you.  Your first step probably should be to go talk 
to him or her.




Re: [R] Regression using age and Duration of disease as continuous factors

2009-07-21 Thread Greg Snow
If you truly did not understand anything in "An Introduction to R" section 11, 
then you are unlikely to understand anything that we would write in a post 
without getting some more background understanding (that is why I suggested 
that you take a class or hire a consultant).

There are many people on this list who give quite a bit of help every day, but 
the word 'help' in R-help means to give hints, or to assist when you do your 
part, or to discuss and give insight.  It does not mean that we will do your 
work for you.

You are making it difficult for us to help you.  I for one am horrible at mind 
reading (just ask my wife).

Given the information in your original question, the answer may be:

> fit <- lm(Duration ~ age, data=disease)
> summary(fit)

But the above could also be completely useless, or give an error, or, if some 
assumptions don't hold, be worse than useless by giving results that are 
completely wrong but look good and lead you in the wrong direction.

If you try the above code and come back with only a statement about it not 
working or not understanding the output without any detail, then my sole 
response will be "I told you so!".

But if you can give us some detail on the background of your question, what 
question you are trying to answer, what your data look like, and what your 
education/understanding is, then we may be able to help.
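
As a rough illustration of how the assumptions behind the lm call above
might be checked, the sketch below fits the same formula and draws R's
standard diagnostic plots. The data frame `disease` and its columns
`age` and `Duration` are simulated here purely so the code runs; the
real data will look different, and passing these checks does not by
itself make the model appropriate.

```r
# Hypothetical stand-in for the poster's data: column names follow the
# lm() call above, the values are simulated.
set.seed(42)
disease <- data.frame(age = runif(200, 20, 80))
disease$Duration <- pmax(0, 0.2 * disease$age + rnorm(200, sd = 4))

fit <- lm(Duration ~ age, data = disease)

# Four standard diagnostic plots: residuals vs fitted, normal Q-Q,
# scale-location, and residuals vs leverage.
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)

# Confidence intervals for the intercept and the age coefficient.
confint(fit)
```

Curvature or a funnel shape in the residual plots, or a strongly bent
Q-Q plot, would be a hint that a plain linear model is not a good fit.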



Re: [R] Regression using age and Duration of disease as continuous factors

2009-07-21 Thread 1Rnwb

Thanks Steve, thanks for the explanation. I agree the question is too vague.
I do not what a regression is; I switched to R a couple of months ago,
after working in Excel for a long time. I also know the lm and glm functions
in R, but due to my data I am completely lost. It looks like the expert
individuals just come to poke fun at the expense of those of us who have no
background in statistics.

I have 8 proteins and two groups, with 840 samples in control and 1140
samples in disease, further stratified by sex, draw age, and duration of
disease. All these groups and subgroups make it very confusing how to do
the regression in R. The purpose is to show the changes in the levels of
these proteins as the disease progresses, or changes in their levels with
respect to progression in age, effect of gender, and SNPs for these
proteins. It is a pretty big dataset.

The suggestion to consult the statistician is kind of funny, as the
statistician in my center is my co-mentor and for the past 5 years he has
been sitting on the data without any output.

I am not here to ask someone to do my data analysis, but to get an
understanding of the process as well as a proper direction to look in for
the analysis. After all, I do have to explain all these things to my boss
as well.

Thanks





Re: [R] Regression using age and Duration of disease as continuous factors

2009-07-21 Thread Ben Bolker



1Rnwb wrote:
> 
> 
>  [snip]
> 
> The suggestion that consult the statistician is kind of funny as  the
> statistician in my center is my co-mentor and from past 5 years he is
> sitting on the data without any output. 
> 
> I am not here to ask someone to do my data analysis, but to get an
> understanding of the process as well as a proper direction to look for the
> analysis.  after all I do have to explain all these things to my boss as
> well. 
> 
> 

Does your co-mentor have any graduate students, postdocs, etc. who could
help you?  Anyone else in your group with more statistical knowledge? Are
there courses in regression offered at your institution? This really looks
like a large and potentially complicated question, and it could take hours
of someone's (unpaid) time to help you through it.  At least people
associated with your institution will (1) know the field [and hence be able
to spot potential pitfalls], (2) be physically present [communication is
easier/faster].  Postdocs or graduate students might be willing to help you
out for the price of beer or coffee, or out of interest -- or for
co-authorship on a manuscript, if it comes to that.  

  There are books on regression geared toward R by Julian Faraway and Frank
Harrell, among others.  If repeated samples are drawn from the same
individual you may also need to consult Pinheiro and Bates 2000.
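
For the repeated-samples case, a minimal sketch of a random-intercept
model with the nlme package (the software that accompanies Pinheiro and
Bates). The data are simulated, and the names `subject`, `age`, and
`protein1` are hypothetical; whether this structure matches the real
study design is an assumption.

```r
library(nlme)  # a recommended package, shipped with standard R installs

# Simulated repeated measures: 50 subjects, 4 samples each. All names
# and effect sizes are invented for illustration.
set.seed(2009)
n_subj <- 50
n_rep  <- 4
d <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = n_rep)),
  age     = rep(runif(n_subj, 20, 70), each = n_rep) +
            rep(0:(n_rep - 1), times = n_subj)
)
subj_effect <- rnorm(n_subj, sd = 1)[as.integer(d$subject)]
d$protein1  <- 3 + 0.04 * d$age + subj_effect + rnorm(nrow(d), sd = 0.5)

# A random intercept per subject allows samples from the same person
# to be more alike than samples from different people.
fit <- lme(protein1 ~ age, random = ~ 1 | subject, data = d)
summary(fit)
```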

  At the risk of offending you further:

> library(fortunes)
> fortune("brain")

I wish to perform brain surgery this afternoon at 4pm and don't know where to
start. My background is the history of great statistician sports legends but I
am willing to learn. I know there are courses and numerous books on brain
surgery but I don't have the time for those. Please direct me to the
appropriate HowTos, and be on standby for solving any problem I may encounter
while in the operating room. Some of you might ask for specifics of the case,
but that would require my following the posting guide and spending even more
time than I am already taking to write this note.
   -- I. Ben Fooled (aka Frank Harrell)
      R-help (April 1, 2005)



Re: [R] Regression using age and Duration of disease as continuous factors

2009-07-21 Thread Mehdi Khan
http://www.amazon.com/gp/product/007310874X/ref=pd_lpo_k2_dp_sr_1?pf_rd_p=304485901&pf_rd_s=lpo-top-stripe-1&pf_rd_t=201&pf_rd_i=0256117365&pf_rd_m=ATVPDKIKX0DER&pf_rd_r=155Y7AP1SHTSJESHM15M

This is our textbook for regression analysis.  Go through the first 8 or 9
chapters and you're good.

Mehdi Khan


Re: [R] Regression using age and Duration of disease as continuous factors

2009-07-21 Thread Steve Lianoglou

> It looks like the expert individuals just come to poke fun at the
> expense of those of us who have no background in statistics.

This isn't really a fair statement ... I'd simply suggest to be
mindful of what you ask. It was as if you couldn't be bothered to take
the time to fully describe your problem (how was anybody supposed to
deduce what you explained below from your original email??), but
wanted other people to take their time and to understand what you want
and do your work for you.

When you look at it that way, it's not a big surprise that you
received some of the answers you received. Lastly, I'm not sure how
true this is through and through, or how relevant it is to *this
particular scenario* but when people post to a somehow-professional
list such as this one, I'd think it's generally frowned upon to use
some bizarre alias instead of a real name (my 2 cents, there).

In any event, perhaps we can all move on.

As a disclaimer, anything I say from here on out would require taking
with a grain of salt:

> I have 8 proteins and two groups, with 840 samples in control and 1140
> samples in disease, further stratified by sex, draw age, and duration of
> disease. All these groups and subgroups make it very confusing how to do
> the regression in R. The purpose is to show the changes in the levels of
> these proteins as the disease progresses, or changes in their levels with
> respect to progression in age, effect of gender, and SNPs for these
> proteins. It is a pretty big dataset.


I'd start by trying to create some clever graphics to see if you can
eyeball any trends, and whether you can get some juice out of further
downstream analysis.
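
A small sketch of the kind of exploratory graphics meant here, using
only base R. The data frame is simulated and every column name
(`group`, `sex`, `age`, `protein1`) is a hypothetical stand-in for
whatever the real dataset uses.

```r
# Simulated stand-in data: every column name and value is hypothetical.
set.seed(7)
n <- 300
d <- data.frame(
  group = factor(sample(c("control", "disease"), n, replace = TRUE)),
  sex   = factor(sample(c("F", "M"), n, replace = TRUE)),
  age   = runif(n, 20, 80)
)
d$protein1 <- 5 + 0.03 * d$age + 0.5 * (d$group == "disease") + rnorm(n)

# Protein level against age, one panel per group/sex combination,
# with a smooth trend line drawn in each panel.
coplot(protein1 ~ age | group * sex, data = d, panel = panel.smooth)

# Distribution of the protein level in each group.
boxplot(protein1 ~ group, data = d, ylab = "protein1 level")
```

Repeating the same plots for each of the 8 proteins gives a first
impression of which trends, if any, are worth modelling.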


Anyway, I don't think there is a simple answer you can get from an  
email, and I'm a bit surprised that your statistician mentor doesn't  
have at least some idea of where to start. It sounds like you want to  
build some predictive model that uses the values in your predictor  
variables to predict some real valued expression of your protein(s) --  
and the problem is that there is no guarantee that you can do this  
with the data you have anyway (repeat after me: "research is fun").


That being said, one (overly) simple approach (there is no grouping/
subgrouping here) is to use glmnet and try lasso or elastic-net
regression, using all the factors you mention as predictor variables
for the 8 different output vectors, which would be the individual
expression of your proteins (so -- that's 8 different models you're
trying to learn).


The hope is that the lasso will nuke some of the predictors (by  
setting their coefficients to 0) and help you find "the most  
important" factors that influence the protein expression ... in all  
likelihood, this probably won't work ... and if this is the type of  
answer you are looking to get, I'm not sure you will get anything  
satisfactory (repeat after me: "research is fun").
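
A rough sketch of the lasso idea for a single protein, assuming the
glmnet package from CRAN is installed. The data are simulated and all
column names and effect sizes are hypothetical; the same steps would be
repeated for each of the 8 proteins.

```r
# Requires the glmnet package from CRAN: install.packages("glmnet")
library(glmnet)

# Simulated stand-in data: column names and effect sizes are hypothetical.
set.seed(123)
n <- 500
d <- data.frame(
  group    = factor(sample(c("control", "disease"), n, replace = TRUE)),
  sex      = factor(sample(c("F", "M"), n, replace = TRUE)),
  age      = runif(n, 20, 80),
  duration = runif(n, 0, 30)
)
d$protein1 <- 2 + 0.05 * d$age + 0.8 * (d$group == "disease") + rnorm(n)

# glmnet needs a numeric predictor matrix; model.matrix() expands the
# factors into dummy columns. The intercept column it adds is dropped.
x <- model.matrix(protein1 ~ group + sex + age + duration, data = d)[, -1]
y <- d$protein1

# alpha = 1 is the lasso; 0 < alpha < 1 gives the elastic net.
# Cross-validation picks the penalty strength lambda.
cvfit <- cv.glmnet(x, y, alpha = 1)

# Coefficients at a cross-validated lambda; predictors shrunk all the
# way to zero are printed as ".".
coef(cvfit, s = "lambda.1se")
```

In this simulation only `group` and `age` actually affect the outcome,
so one would hope the lasso drops the rest, though that is not
guaranteed for any particular random draw.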



> I am not here to ask someone to do my data analysis, but to get an
> understanding of the process as well as a proper direction to look
> for the analysis.  after all I do have to explain all these things
> to my boss as well.


I'm not an expert, but there is no canned process to do this ... and  
like I said, there is no guarantee you can do this .. I mean, does it  
make sense to set up your problem in this way and expect a reasonable  
outcome (biologically speaking-wise)? Do you have to somehow take into  
account how these 8 proteins are interacting w/ each other? Many  
questions to answer ...


Anyway ... I'm not sure there's any real value in this email, but I've  
got my own fish to fry so time to move on ...


-steve



Re: [R] Regression using age and Duration of disease as continuous factors

2009-07-21 Thread Steve Lianoglou


On Jul 21, 2009, at 1:56 PM, Mehdi Khan wrote:

> http://www.amazon.com/gp/product/007310874X/ref=pd_lpo_k2_dp_sr_1?pf_rd_p=304485901&pf_rd_s=lpo-top-stripe-1&pf_rd_t=201&pf_rd_i=0256117365&pf_rd_m=ATVPDKIKX0DER&pf_rd_r=155Y7AP1SHTSJESHM15M
>
> This is our textbook for regression analysis.  Go through the first
> 8 or 9 chapters and you're good.


By the by, do you like this book?



Re: [R] Regression using age and Duration of disease as continuous factors

2009-07-21 Thread Marc Schwartz

On Jul 21, 2009, at 11:29 AM, 1Rnwb wrote:

> Thanks Steve, thanks for the explanation. I agree the question is too
> vague. I do not what a regression is; I switched to R a couple of months
> ago, after working in Excel for a long time. I also know the lm and glm
> functions in R, but due to my data I am completely lost. It looks like
> the expert individuals just come to poke fun at the expense of those of
> us who have no background in statistics.
>
> I have 8 proteins and two groups, with 840 samples in control and 1140
> samples in disease, further stratified by sex, draw age, and duration of
> disease. All these groups and subgroups make it very confusing how to do
> the regression in R. The purpose is to show the changes in the levels of
> these proteins as the disease progresses, or changes in their levels with
> respect to progression in age, effect of gender, and SNPs for these
> proteins. It is a pretty big dataset.
>
> The suggestion to consult the statistician is kind of funny, as the
> statistician in my center is my co-mentor and for the past 5 years he has
> been sitting on the data without any output.
>
> I am not here to ask someone to do my data analysis, but to get an
> understanding of the process as well as a proper direction to look in for
> the analysis. After all, I do have to explain all these things to my boss
> as well.
>
> Thanks




First, welcome to R.

Notwithstanding other replies, a key issue here is that the specific  
data and analytic domains for which you are querying are not ones that  
can be really learned remotely. These are not "simple" regression  
models and this is certainly not an area that the point and click  
approach of Excel would even begin to address, much less the plethora  
of other criticisms relevant to Excel's use for statistical analysis.


To the question that you pose in the final paragraph above, the proper  
direction for you at this point is to seek out a professional  
statistician with expertise in this particular domain. I would think  
that after 5 years, even your boss would be more comfortable in  
knowing that this was done with the requisite expertise applied.


It sounds like you are a clinical researcher/physician. If your  
current statistician is not in a position to offer assistance after 5  
years, for whatever reason, then as I note above, you need to seek  
another with experience in this domain who can work with you in close  
collaboration on this project. Neither statistician nor clinician  
should work in isolation here. It is the value in collaboration where  
each brings their own respective expertise to the table that results  
in a reasonable result.


The purpose of R's e-mail lists is not to provide general statistical  
consultancy, but to address specific issues as they pertain to R. Your  
initial queries fall into the former. In other words, your questions  
so far focus more on learning what are in fact, quite complex  
statistical methods and insights. That being said, there will be some  
interactions on the lists pertaining to general statistical issues  
when presented with *focused* questions, even though they may not be R  
specific.


The nature of your data suggests that you might benefit from the use of  
tools that have been made available via the Bioconductor project:


  http://www.bioconductor.org

which is built upon R and intended for this domain. There are entire  
books written on this subject in particular and on regression in  
general, some of which have been referenced by others in this thread.  
Bioconductor exists because it addresses specific needs for analytic  
tools within a statistical subspecialty that R in general may not.


Just as there are specialties within medicine, they exist within  
statistics. You would not have an orthopaedic surgeon perform a mitral  
valve replacement any more than you would have a cardiac surgeon do a  
hip replacement, even though they are both surgeons, went to medical  
school and share general surgery training. They both went on to  
additional years of study within their specialties, diverging in their  
skills and knowledge base at that point.


The same applies in this domain.

There are fundamental questions that you will need to address  
regarding the means by which your data have been collected which can  
and will impact how you go about analyzing it. It sounds like this  
dataset may be the result of a retrospective collection process or  
'data of opportunity' rather than a prospective study design.


Do you have serial protein measurements from the same subjects over  
time, or will your time based hypotheses be inferred based upon single  
protein samples from each subject where the subjects happened to be  
available at differing ages and with differing disease duration/ 
progression at the time of data collection?
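
For the second case only (a single protein measurement per subject), a minimal sketch of what "age and duration of disease as continuous covariates" could look like in R is shown here. Everything in it is an assumption made for illustration: the data are simulated, and the data frame `dat`, its column names, the sample sizes and the effect sizes are invented, not taken from the dataset described above. It is a sketch of the mechanics, not a recommended analysis.

```r
# Simulated stand-in data: one row per subject, one protein (all values invented).
set.seed(42)
n_control <- 84
n_disease <- 114
n <- n_control + n_disease

dat <- data.frame(
  group = factor(rep(c("control", "disease"), times = c(n_control, n_disease))),
  sex   = factor(sample(c("F", "M"), n, replace = TRUE)),
  age   = runif(n, min = 20, max = 70)          # age at draw, in years
)

# Duration of disease is only meaningful for cases; controls are set to 0 here.
dat$duration <- ifelse(dat$group == "disease", runif(n, min = 0, max = 15), 0)

# Hypothetical protein level with an assumed linear dependence on age and duration.
dat$protein <- 5 + 0.02 * dat$age + 0.10 * dat$duration + rnorm(n, sd = 0.5)

# Age and duration enter as numeric (continuous) terms; group and sex as factors.
fit <- lm(protein ~ group + sex + age + duration, data = dat)

print(summary(fit))   # coefficient table: one slope each for age and duration
print(confint(fit))   # 95% confidence intervals for the coefficients
```

If the same subjects were measured repeatedly over time, the observations would no longer be independent and a plain `lm()` fit like this would not be appropriate; a mixed-effects model (for example `lme()` in the nlme package) is the usual starting point for that design.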


Why are there not equal sample sizes in your two groups? Does this  
imply sample selection bias that will have to be taken into account?  
What other source

Re: [R] Re gression using age and Duration of disease as a continous factors

2009-07-21 Thread Johannes Hüsing

Mehdi Khan wrote:

> http://www.amazon.com/gp/product/007310874X/ref=pd_lpo_k2_dp_sr_1?pf_rd_p=304485901&pf_rd_s=lpo-top-stripe-1&pf_rd_t=201&pf_rd_i=0256117365&pf_rd_m=ATVPDKIKX0DER&pf_rd_r=155Y7AP1SHTSJESHM15M
>
> This is our textbook for regression analysis.  Go through the first 8 or 9
> chapters and you're good.

I think he'd be ill-advised to do so.

Modelling the course of a disease is a very tricky problem which 
definitely calls for a statistician with a couple of years' experience 
under his belt, and more than just a few hours of his time. Perusing a 
book on regression may be a good thing to do in order to be able to 
communicate with the statistician, or to solve some (considerably) simpler 
problem oneself, but it's like perusing a book of anatomy before your 
first surgical intervention.




Re: [R] Re gression using age and Duration of disease as a continous factors

2009-07-21 Thread Mehdi Khan
It was decent. I ended up skipping class, only going to discussions
and reading the book, and got an A-.

On Tue, Jul 21, 2009 at 12:18 PM, Steve Lianoglou <
mailinglist.honey...@gmail.com> wrote:

>
> On Jul 21, 2009, at 1:56 PM, Mehdi Khan wrote:
>
>
>> http://www.amazon.com/gp/product/007310874X/ref=pd_lpo_k2_dp_sr_1?pf_rd_p=304485901&pf_rd_s=lpo-top-stripe-1&pf_rd_t=201&pf_rd_i=0256117365&pf_rd_m=ATVPDKIKX0DER&pf_rd_r=155Y7AP1SHTSJESHM15M
>>
>> This is our textbook for regression analysis.  Go through the first 8 or 9
>> chapters and you're good.
>>
>
> By the by, do you like this book?
>
>

