Dear Eleni,

The following is from a previous post about the maximum number of variables in a multiple linear regression analysis, posted last Tuesday. I think it is also relevant to Cox PH models:

"I can think of no circumstance where multiple regression on "hundreds of thousands of variables" is anything more than a fancy random number generator."

The thread is continued by someone with the same problem as you:

"When I try a regression problem with 3,000 coefficients in R running under Windows XP 64 bit with 8Gb of memory on the machine and the /3Gb option active (i.e., R can get up to 3Gb), R 2.6.1 runs out of memory (apparently trying to duplicate the model matrix)."

but the author continues:

"...one must be careful doing ordinary linear regression with large numbers of coefficients. It does seem a little unlikely that there is sufficient data to get useful estimates of three thousand coefficients using linear regression."

I also work with genomic data, and filtering the data first seems to be a well-accepted rule. I am sure that not all of your 18000 genes are relevant to your study or have an effect on survival. Have a look at the BioConductor mailing list for info on this topic.

Best,
David

> Hi David,
>
> The problem is that I need all these regressors. I need a coefficient for
> every one of them and then rank them according to that coefficient.
>
> Thanks,
> Eleni
>
> On Feb 12, 2008 4:54 PM, <[EMAIL PROTECTED]> wrote:
>
> > Hi Eleni,
> >
> > I am not an expert in R or statistics, but in my opinion you have too
> > many regressors compared to the number of observations, and that might
> > be the reason why you get the error. Others might explain it better, but
> > as far as I know, with only 80 observations it is a good idea to first
> > filter your list of variables down to a few tens.
> >
> > HTH
> >
> > David
> >
> > > Hello R-community,
> > >
> > > It's been a week now that I am struggling with the implementation of a
> > > cox model in R. I have 80 cancer patients, so 80 time measurements and
> > > 80 relapse or no measurements (respective to censor, 1 if relapsed over
> > > the examined period, 0 if not). My microarray data contain around 18000
> > > genes. So I have the expressions of 18000 genes in each of the 80 tumors
> > > (matrix 80*18000). I would like to build a cox model in order to
> > > retrieve the most significant genes (according to the p-value). The
> > > command that I am using is:
> > >
> > > test1 <- list(time,relapse,genes)
> > > coxph( Surv(time, relapse) ~ genes, test1)
> > >
> > > where time is a vector of size 80 containing the times, relapse is a
> > > vector of size 80 containing the relapse values and genes is a matrix
> > > 80*18000. When I give the coxph command I get an error saying that it
> > > cannot allocate vector of size 2.7Mb (in Windows). I also tried linux
> > > and then I receive an error that maximum memory is reached. I increase
> > > the memory by initializing R with the command:
> > >
> > > R --min-vsize=10M --max-vsize=250M --min-nsize=1M --max-nsize=200M
> > >
> > > I think it cannot get better than that because if I try for example
> > > max-vsize=300 the memory capacity is stored as NA.
> > >
> > > Does anyone have any idea why this happens and how I can overcome it?
> > >
> > > I would be really grateful if you could help!
> > > It has been bothering me a lot!
> > >
> > > Thank you all,
> > > Eleni
> > >
> > > ______________________________________________
> > > R-help@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
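
For the stated goal of a coefficient and p-value for every gene, one possible route (not something proposed in the thread itself) is to fit a separate single-covariate Cox model per gene instead of one model with all 18000 genes at once. The following is only a minimal sketch under stated assumptions: the data are simulated stand-ins, 200 genes are used instead of 18000 to keep it fast, and `fit_one_gene`, `per_gene` and `ranked` are hypothetical names chosen for illustration.

```r
library(survival)  # provides Surv() and coxph()

# Simulated stand-in data (assumption: not the real microarray data).
set.seed(1)
n_patients <- 80
n_genes <- 200   # assumption: reduced from 18000 to keep the sketch fast
genes <- matrix(rnorm(n_patients * n_genes), nrow = n_patients,
                dimnames = list(NULL, paste0("gene", seq_len(n_genes))))
time <- rexp(n_patients, rate = 0.1)                 # follow-up times
relapse <- rbinom(n_patients, size = 1, prob = 0.6)  # 1 = relapsed, 0 = censored

# Hypothetical helper: fit a Cox model with a single gene as covariate
# and return its coefficient and Wald p-value.
fit_one_gene <- function(x) {
  fit <- coxph(Surv(time, relapse) ~ x)
  summary(fit)$coefficients[1, c("coef", "Pr(>|z|)")]
}

# One row per gene, columns: coefficient and p-value.
per_gene <- t(apply(genes, 2, fit_one_gene))
colnames(per_gene) <- c("coef", "p_value")

# Rank genes by p-value, smallest first.
ranked <- per_gene[order(per_gene[, "p_value"]), ]
head(ranked)
```

Each fit here involves only one covariate, so no 80*18000 model matrix is ever built. With thousands of tests the raw p-values should not be read at face value; a multiple-testing correction such as `p.adjust(per_gene[, "p_value"], method = "BH")` is the usual next step.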