[R] Compiling R as a shared library
Hi,

I am trying to compile R as a shared library (I need to run RMAGEML, which depends on SJava) on 64-bit SUSE Linux 9.1. I am using the source code for R 2.0.1 (Nov 15 04).

    ./configure R_PAPERSIZE=LETTER \
        R_BROWSER=/opt/kde3/share/applications/kde/konqbrowser.desktop \
        --enable-R-shlib

results in several errors (selected errors below), and make check also yields errors (at bottom).

The errors seem to centre around conftest, with

    configure:29275: test -z || test ! -s conftest.err

being fairly typical, occurring multiple times in config.log.

Any advice on troubleshooting this would be greatly appreciated.

Min-Han

___

configure:1908: checking for working aclocal-1.4
configure:1919: result: missing
configure:1923: checking for working autoconf
configure:1930: result: found
configure:1938: checking for working automake-1.4
configure:1949: result: missing

configure:4488: gcc -c -g -O2 -I/usr/local/include conftest.c >&5
conftest.c:2: error: parse error before "me"
configure:4494: $? = 1
configure: failed program was:
| #ifndef __cplusplus
| choke me
| #endif

configure:4707: gcc -E -I/usr/local/include conftest.c
conftest.c:16:28: ac_nonexistent.h: No such file or directory
configure:4713: $? = 1
configure: failed program was:
| /* confdefs.h. */
|
| #define PACKAGE_NAME "R"
| #define PACKAGE_TARNAME "R"
| #define PACKAGE_VERSION "2.0.1"
| #define PACKAGE_STRING "R 2.0.1"
| #define PACKAGE_BUGREPORT "[EMAIL PROTECTED]"
| #define PACKAGE "R"
| #define VERSION "2.0.1"
| #define R_PLATFORM "x86_64-unknown-linux-gnu"
| #define R_CPU "x86_64"
| #define R_VENDOR "unknown"
| #define R_OS "linux-gnu"
| #define Unix 1
| /* end confdefs.h. */
| #include <ac_nonexistent.h>

configure:4814: gcc -E -I/usr/local/include conftest.c
conftest.c:16:28: ac_nonexistent.h: No such file or directory
configure:4820: $? = 1
configure: failed program was:
| /* confdefs.h. */
|
| #define PACKAGE_NAME "R"
| #define PACKAGE_TARNAME "R"
| #define PACKAGE_VERSION "2.0.1"
| #define PACKAGE_STRING "R 2.0.1"
| #define PACKAGE_BUGREPORT "[EMAIL PROTECTED]"
| #define PACKAGE "R"
| #define VERSION "2.0.1"
| #define R_PLATFORM "x86_64-unknown-linux-gnu"
| #define R_CPU "x86_64"
| #define R_VENDOR "unknown"
| #define R_OS "linux-gnu"
| #define Unix 1
| /* end confdefs.h. */
| #include <ac_nonexistent.h>

configure:5099: gcc -E -I/usr/local/include conftest.c
conftest.c:16:28: ac_nonexistent.h: No such file or directory
configure:5105: $? = 1
configure: failed program was:
| /* confdefs.h. */
|
| #define PACKAGE_NAME "R"
| #define PACKAGE_TARNAME "R"
| #define PACKAGE_VERSION "2.0.1"
| #define PACKAGE_STRING "R 2.0.1"
| #define PACKAGE_BUGREPORT "[EMAIL PROTECTED]"
| #define PACKAGE "R"
| #define VERSION "2.0.1"
| #define R_PLATFORM "x86_64-unknown-linux-gnu"
| #define R_CPU "x86_64"
| #define R_VENDOR "unknown"
| #define R_OS "linux-gnu"
| #define Unix 1
| /* end confdefs.h. */
| #include <ac_nonexistent.h>

configure:29307: checking whether isfinite is declared
configure:29340: gcc -c -g -O2 -I/usr/local/include conftest.c >&5
conftest.c: In function `main':
conftest.c:122: error: `isfinite' undeclared (first use in this function)
conftest.c:122: error: (Each undeclared identifier is reported only once
conftest.c:122: error: for each function it appears in.)
configure:29346: $? = 1
configure: failed program was:
| /* confdefs.h. */
|
| #define PACKAGE_NAME "R"
| #define PACKAGE_TARNAME "R"
| #define PACKAGE_VERSION "2.0.1"
| #define PACKAGE_STRING "R 2.0.1"
| #define PACKAGE_BUGREPORT "[EMAIL PROTECTED]"
| #define PACKAGE "R"
| #define VERSION "2.0.1"
| #define R_PLATFORM "x86_64-unknown-linux-gnu"
| #define R_CPU "x86_64"
| #define R_VENDOR "unknown"
| #define R_OS "linux-gnu"
| #define Unix 1
| #ifdef __cplusplus
| extern "C" void std::exit (int) throw (); using std::exit;
| #endif
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_DLFCN_H 1
| #define HAVE_LIBM 1
| #define HAVE_LIBDL 1
| #define STDC_HEADERS 1
| #define TIME_WITH_SYS_TIME 1
| #define HAVE_DIRENT_H 1
| #define HAVE_SYS_WAIT_H 1
| #define HAVE_ARPA_INET_H 1
| #define HAVE_DLFCN_H 1
| #define HAVE_ELF_H 1
| #define HAVE_FCNTL_H 1
| #define HAVE_FPU_CONTROL_H 1
| #define HAVE_GRP_H 1
| #define HAVE_IEEE754_H 1
| #define HAVE_LIMITS_H 1
| #define HAVE_LOCALE_H 1
| #define HAVE_NETDB_H 1
| #define HAVE_NETINET_IN_H 1
| #define HAVE_PWD_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_SYS_PARAM_H 1
| #define HAVE_SYS_SELECT_H 1
| #define HAVE_SYS_SOCKET_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_SYS_TIME_H 1
| #define HAVE_SYS_TIMES_H 1
| #define HAVE_SYS_UTSNAME_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_WCHAR_H 1
| #define HAVE_ERRNO_H 1
| #define HAVE_STDARG_H 1
| #define HAVE_STRING_H 1
| #define HAVE_POSIX_SETJMP 1
| #define HAVE_GLIBC2 1
Re: [R] Postgraduate distance learning courses for MSc in Statistics
There are several institutions here which offer distance learning.

http://programs.gradschools.com/distance/statistics.html

Min-Han Tan

On Mon, 31 Jan 2005 14:13:38 +0100, CG Pettersson <[EMAIL PROTECTED]> wrote:
> 30 January, Tilo Blenk wrote:
> >to
> ... snip ...
>
> Dear Tilo
> As it was surprisingly silent on this question, I can offer a tip for
> you, nearer than both England or USA:
>
> At KVL in Copenhagen, they give several courses with well developed
> distance methodology. They use both SAS and R, not totally parallel
> though.
>
> Check out the following link:
>
> http://statmaster.sdu.dk/
>
> Cheers
> /CG
>
> CG Pettersson, MSci, PhD Stud.
> Swedish University of Agricultural Sciences
> Dep. of Ecology and Crop Production. Box 7043
> SE-750 07 Uppsala
> Sweden

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Successful installation of R 2.0.1 on SUSE 9.1
Hi,

We managed to compile R 2.0.1 on 64-bit SUSE Linux 9.1 on an HP Proliant setup fairly uneventfully by following the instructions in the R installation guide.

We did encounter a minor hiccup in setting up X11, a problem which we note has been raised 4 or 5 times previously, but this was overcome thanks to this recent post by Peter Dalgaard on SUSE 9.1 and R, https://stat.ethz.ch/pipermail/r-help/2005-January/062397.html, as well as previous comments on the mailing list.

One clarification on that post may be helpful: only 3 additional development packages are required for a successful X11 installation.

XFree86-devel-4.3.99.902-30
fontconfig-devel
freetype2-devel

These were not available in YAST (SUSE 9.1), but were located at http://ftp.suse.com/pub/suse/

Once again, thanks to the assorted R gurus and wizards for making this mailing list such a great resource.

Regards,
Min-Han Tan
[R] hclust and heatmap - slightly different dendrograms?
Good afternoon,

I ran heatmap and hclust on the same matrix x (strictly, I ran heatmap(x) and hclust(dist(t(x)))), and realized that the two dendrograms were slightly different, in that the left-right arrangement of one pair of subclusters (columns) was reversed between the two functions (but all individual columns were grouped correctly).

Looking through the code for heatmap as a most definite nonexpert, it seems to me that hclust is also invoked by heatmap.

> heatmap
function (x, Rowv = NULL, Colv = if (symm) "Rowv" else NULL,
    distfun = dist, hclustfun = hclust, add.expr, symm = FALSE,
    ...
    hcr <- hclustfun(distfun(x))
    ddr <- as.dendrogram(hcr)
    hcc <- hclustfun(distfun(if (symm) x else t(x)))
    ddc <- as.dendrogram(hcc)

I understand it is possible to add Rowv=NA and order the samples as per hclust, but I'm just wondering if there is a reason for this observation.

Any pointers would be very much appreciated.

Thanks!

Min-Han Tan
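A possible explanation, offered as a sketch rather than a confirmed diagnosis of the case above: in current versions of R, heatmap() does not only call hclust, it also passes the resulting dendrogram through reorder(), using the row or column means as weights by default. That keeps the same tree but can flip the left-right order of branches. The matrix below is simulated and the object names (dd, dd_reordered, hm) are only illustrative. Supplying a ready-made dendrogram as Colv makes heatmap() use it unchanged.

```r
set.seed(1)
# Hypothetical data: 10 genes (rows) by 6 samples (columns)
x <- matrix(rnorm(60), nrow = 10,
            dimnames = list(paste0("g", 1:10), paste0("s", 1:6)))

# Column clustering exactly as in hclust(dist(t(x)))
hc <- hclust(dist(t(x)))
dd <- as.dendrogram(hc)

# The extra step heatmap() applies by default: reorder by column means
dd_reordered <- reorder(dd, colMeans(x))

# Same leaves and same tree, but the leaf order may differ
labels(dd)
labels(dd_reordered)

# Passing the dendrogram itself as Colv keeps the hclust ordering
hm <- heatmap(x, Colv = dd)
hm$colInd   # column order used in the plot
hc$order    # column order from hclust
```

If the two label vectors differ for a given matrix, the reordering step is the likely source of the reversed pair.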
[R] Re: Estimating survival?
Apologies, I located the function in

cph (... time.inc =30, surv = TRUE)
...
$surv.summary

Thanks.

Min-Han

On Tue, 2 Nov 2004 21:06:33 -0500, Min-Han Tan <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Sorry to trouble the list. I have a problem which I'm not sure how to resolve.
>
> I have a Cox model with 1 independent variable with 2 categories (and
> thus 2 survival curves on plotting survfit)
>
> How can I get an estimate of survival for each category at a
> particular time point, with standard error?
>
> Looking through ?cph and ?coxph, I'm not quite sure how to go about
> that. I would really appreciate any direction here.
>
> Thanks a million!
>
> Min-Han
[R] Estimating survival?
Hi,

Sorry to trouble the list. I have a problem which I'm not sure how to resolve.

I have a Cox model with 1 independent variable with 2 categories (and thus 2 survival curves on plotting survfit).

How can I get an estimate of survival for each category at a particular time point, with standard error?

Looking through ?cph and ?coxph, I'm not quite sure how to go about that. I would really appreciate any direction here.

Thanks a million!

Min-Han
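For reference, one way to get such estimates with the survival package alone is summary() on a survfit object with a times argument. This is a minimal sketch, not the poster's analysis: it assumes the lung data set shipped with survival, a two-level covariate (sex) and an arbitrary time point of 365 days.

```r
library(survival)

# Cox model with a single two-category covariate (sex is coded 1/2 in lung)
fit <- coxph(Surv(time, status) ~ sex, data = lung)

# One predicted survival curve per category
sf <- survfit(fit, newdata = data.frame(sex = c(1, 2)))

# Survival estimate, standard error and confidence limits at day 365
s <- summary(sf, times = 365)
est <- data.frame(sex   = c(1, 2),
                  surv  = as.vector(s$surv),
                  se    = as.vector(s$std.err),
                  lower = as.vector(s$lower),
                  upper = as.vector(s$upper))
est
```

The same summary(..., times = ) call also works on a Kaplan-Meier fit such as survfit(Surv(time, status) ~ sex, data = lung) if model-free estimates are preferred.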
[R] Fitting additional variables to model
Good morning,

Sorry to trouble the list. I'm working on Cox models of survival, and am encountering a problem.

I'm trying to group variables into some kind of new staging system.

By grouping, I mean: so-called 'integrated staging systems' for cancer merge categories of variables such as tumor stage and patient status into a single range of categories, e.g.

System I = Stage I or II, Patient Status 0
System II = Stage I or II, Patient Status 1 OR Stage III, Patient Status 0

So in this example, Stage I + II are grouped together, probably based on outcome.

So in the scenario where each of the initial 2 variables A and B involved in the model has 4 categories:

1. Is there any other way to obtain a grouping of the variables by outcome besides examining all possible 16 Kaplan Meier curves concurrently, and seeing how they group? Would it make sense to run pairwise survfits - but if so, what happens when more variables are introduced into the equation? Finally, is it possible to execute this in R? Thanks!!!

2. If there is a significant interaction between these 2 terms (A*B), does it even make sense to ask how I can perform "grouping" of the variables?

Thanks in advance!

Min-Han
Re: [R] Validating a Cox model on an external set
Thank you all for your very insightful comments! And thank you for the directions to the packages!

Re: non-statistical issues, yes, I was looking through Altman and Royston's Stat Med 2000 article "What do we mean by validating a prognostic model?" last night and it was very interesting. I'm working on expression profiling in tumor samples, and there are several difficulties in designing an experiment along those 'ideal' guidelines.

A. Small sample size is of course the most common recurring problem, with a concomitant even lower event rate.

B. Patient recruitment is yet another issue, as many of the samples are degraded after time - the older the sample, the more degraded it usually is! So that biases selection towards recent cases. Different centres have different storage techniques, resulting in extensive degradation in samples from 1 centre and relatively intact samples from others. So there is no choice but to perform "data-driven selection" of cases - i.e. only samples which have good RNA.

Other problems I've encountered include:

C. Computational time. Using a training sample size of 50 arrays, running a full internal cross-validation of a model derived using pamr.cv.cox took my computer about one and a half hours (with no other process running). (P4, 3 GHz, 2 GB RAM, R 1.9.1, Windows XP) And that's just *one* randomization!

Min-Han

On Tue, 28 Sep 2004 10:55:50 -0700, Berton Gunter <[EMAIL PROTECTED]> wrote:
>
> But note that there may be deeper, non-statistical, issues of what you mean
> by "validation" here: how good must the predictions be on the validation
> data? How similar or dissimilar should the validation data be to the
> "training" data? To what end/population is the fitted model to be applied?
> For example, AFAIK in most scientific research, a model is not considered
> "validated" unless results can be substantively reproduced (??) in different
> labs, sometimes with alternative methods.
>
> Think of the 1916 (I think it was) measurements of star positions during a
> total solar eclipse to "validate" Einstein's Theory of General Relativity.
> My point is not to say that this kind of "validation" is appropriate for a
> Cox model, but only that the issues are worth thinking about.
>
> -- Bert Gunter
> Genentech Non-Clinical Statistics
> South San Francisco, CA
>
> "The business of the statistician is to catalyze the scientific learning
> process." - George E. P. Box
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of Frank E Harrell Jr
> > Sent: Tuesday, September 28, 2004 10:11 AM
> > To: Min-Han Tan
> > Cc: [EMAIL PROTECTED]
> > Subject: Re: [R] Validating a Cox model on an external set
> >
> > Min-Han Tan wrote:
> > > Good morning,
> > >
> > > Sorry to trouble the list.
> > >
> > > I have a problem I hope to seek your advice on.
> > >
> > > Essentially, I am trying to 'validate' a multivariate Cox proportional
> > > hazards model built in a training set, by testing it on an external
> > > test set. I have performed a survfit using the Cox model to predict
> > > survival for the test set, and obtained individual predictions for
> > > survival time, with standard error for each test sample. Each of these
> > > cases has an actual survival time, some censored.
> > >
> > > How can we decide whether the Cox model has been validated or not?
> >
> > This is what the Design package and its cph and validate.cph and
> > calibrate.cph functions are for.
> >
> > > I was suggested survdiff in the survival package, but survdiff works
> > > between curves; am not sure how I could use it (I have a predicted
> > > curve for each curve, but no 'observed curve' - the only observation
> > > is death or censoring at time x)
> > >
> > > Thank you all so much!
> > >
> > > Min-Han Tan
> > > Van Andel Institute
> >
> > --
> > Frank E Harrell Jr Professor and Chair School of Medicine
> > Department of Biostatistics
> > Vanderbilt University
[R] Validating a Cox model on an external set
Good morning,

Sorry to trouble the list.

I have a problem I hope to seek your advice on.

Essentially, I am trying to 'validate' a multivariate Cox proportional hazards model built in a training set, by testing it on an external test set. I have performed a survfit using the Cox model to predict survival for the test set, and obtained individual predictions for survival time, with standard error for each test sample. Each of these cases has an actual survival time, some censored.

How can we decide whether the Cox model has been validated or not?

It was suggested that I use survdiff in the survival package, but survdiff works between curves; I am not sure how I could use it (I have a predicted curve for each case, but no 'observed curve' - the only observation is death or censoring at time x).

Thank you all so much!

Min-Han Tan
Van Andel Institute
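One commonly used check on an external set, sketched here with the survival package only, is to freeze the training model, compute its linear predictor for each test case, and then see how well that single score discriminates and calibrates in the test data. This is an illustrative sketch under stated assumptions, not the Design package workflow recommended in the replies: the lung data set stands in for real data, the 60/40 split is arbitrary, and the names train, test, lp and val are hypothetical.

```r
library(survival)

# Stand-in data: complete cases of a few lung variables
d <- na.omit(lung[, c("time", "status", "age", "sex", "ph.ecog")])

set.seed(42)
idx   <- sample(nrow(d), size = round(0.6 * nrow(d)))
train <- d[idx, ]
test  <- d[-idx, ]

# Model is fitted on the training set only
fit <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = train)

# Linear predictor (risk score) for each test case from the frozen model
test$lp <- predict(fit, newdata = test, type = "lp")

# Refit with the score as the only covariate in the test set
val <- coxph(Surv(time, status) ~ lp, data = test)

slope  <- unname(coef(val))          # calibration slope; values near 1 are good
cindex <- summary(val)$concordance   # concordance (discrimination) and its SE
slope
cindex
```

A slope well below 1 suggests the training model is overfitted, while a concordance close to 0.5 means the score barely ranks test cases better than chance.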
[R] Cox proportional hazards model
Good afternoon,

I am currently trying to do some work on survival analysis. I hope to seek your advice re: 2 questions (1 general and 1 specific).

(1) I'm trying to do a stratified Cox analysis and subsequently plot(survfit(object)). It seems to work for some strata, but not for others.

I have tumor grade, which is a range of 1 - 4. When I divide this range of 1:4 into 2 groups, it works fine for strata(grade>2) and strata(grade>3). However, if I do a strata(grade>1), there is an error when I do a survfit( ):

Call: survfit.coxph(object = s)
Error in print.survfit(structure(list(n = as.integer(46), time = c(22, :
        length of dimnames [1] not equal to array extent

(2) As a general question, is it possible to distinguish between confounding and interaction in the Cox proportional hazards model?

Thanks!

Min-Han
[R] Re: brlr function
Thank you all. The cause was that the dependent variable should have been coded 0 and 1, rather than 1 and 2! Apologies for any trouble.

Min-Han

On Wed, 25 Aug 2004 18:40:38 -0400, Min-Han Tan <[EMAIL PROTECTED]> wrote:
> ... snip ...
[R] brlr function
Hi,

I'm trying the brlr function in a penalized logistic regression function.

However, I am not sure why I am encountering errors. I hope to seek your advice here. (output below)

Thank you! Your help is truly appreciated.

Min-Han

#No error here, the glm seems to work fine
> genes.cox1.glm1<-glm(as.formula(paste(paste('as.integer(sim.cv.cox.yhat1)~'),paste('sim.dat.tst[',genes.cox1.rows,',]',sep="",collapse='+'

#Something happened here ... I only substituted brlr for glm
> genes.cox1.glm1<-brlr(as.formula(paste(paste('as.integer(sim.cv.cox.yhat1)~'),paste('sim.dat.tst[',genes.cox1.rows,',]',sep="",collapse='+'
Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1

#using the test data in brlr gives the same error if the dependent is
#reduced to one column, is it a problem with my dependent? It works
#fine as per the original vignette with
#brlr(cbind(grahami,opalinus)~height ...)
> brlr(grahami~height+diameter,data=lizards)
Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1

#This is how the objects look
> as.integer(sim.cv.cox.yhat1)
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 2 2 2 1 1 2 2 1 1 1 2 1 2 2 2 1 2 1 1 2 1 1 2 1 1 1 2 2 2 2

> as.formula(paste(paste('as.integer(sim.cv.cox.yhat1)~'),paste('sim.dat.tst[',genes.cox1.rows,',]',sep="",collapse='+')))
as.integer(sim.cv.cox.yhat1) ~ sim.dat.tst[27, ] + sim.dat.tst[35, ] +
    sim.dat.tst[17, ] + sim.dat.tst[41, ] + sim.dat.tst[38, ] +
    sim.dat.tst[31, ] + sim.dat.tst[3, ] + sim.dat.tst[32, ] +
    sim.dat.tst[16, ] + sim.dat.tst[48, ] + sim.dat.tst[13, ] +
    sim.dat.tst[28, ]

> sim.dat.tst[27,]
 Sample 1  Sample 2  Sample 3  Sample 4  Sample 5  Sample 6  Sample 7  Sample 8
     5.79      6.72      6.11      5.58      6.32      6.50      6.45      5.77
 Sample 9 Sample 10 Sample 11 Sample 12 Sample 13 Sample 14 Sample 15 Sample 16
     7.46      6.32      5.64      5.77      5.44      5.83      5.57      5.70
Sample 17 Sample 18 Sample 19 Sample 20 Sample 21 Sample 22 Sample 23 Sample 24
     5.67      5.72      5.50      6.34      5.98      6.10      6.25      6.13
Sample 25 Sample 26 Sample 27 Sample 28 Sample 29 Sample 30 Sample 31 Sample 32
     7.40      7.28      8.37      5.73      6.83      5.72      6.12      6.74
Sample 33 Sample 34 Sample 35 Sample 36 Sample 37 Sample 38 Sample 39 Sample 40
     6.64      6.13      7.74      6.70      7.37      6.54      6.49      5.75
Sample 41 Sample 42 Sample 43 Sample 44 Sample 45 Sample 46 Sample 47 Sample 48
     6.18      6.41      7.68      5.36      6.34      5.86      6.64      6.69
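The message "y values must be 0 <= y <= 1" is the one R's binomial family raises when a numeric response falls outside [0, 1]. The first glm call in the post most likely ran without complaint because glm defaults to the gaussian family. The sketch below reproduces the error with glm(family = binomial) on made-up data and shows the recoding from 1/2 to 0/1. It uses glm rather than brlr so that it runs without extra packages, on the assumption, suggested by the calls in the post, that brlr takes the same kind of formula and response. The variable names y12, y01 and x are hypothetical.

```r
set.seed(1)

# Class labels coded 1/2, as in as.integer(sim.cv.cox.yhat1)
y12 <- rep(c(1L, 2L), each = 10)
# Made-up predictor, loosely related to the class
x <- rnorm(20) + y12

# A binomial fit rejects a 1/2 response; capture the error message
bad <- tryCatch(glm(y12 ~ x, family = binomial),
                error = function(e) conditionMessage(e))
bad

# Recode to 0/1 and the same call is accepted
y01 <- y12 - 1L
fit <- glm(y01 ~ x, family = binomial)
coef(fit)
```

Subtracting 1 inside the formula, for example I(as.integer(sim.cv.cox.yhat1) - 1), would be an equivalent way to recode without creating a new object.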
[R] A troubled state of freedom: generalized linear models where number of parameters > number of samples
Good morning,

Thank you all for your help so far. I really appreciate it.

The crux of my problem is that I am generating a generalized linear model with 1 dependent variable, approximately 50 training samples and 100 parameters (gene levels).

Essentially, if I have 100 genes and 50 samples, this results in coefficients for the first 49 genes, and NAs for the rest, with an ultra low residual deviance (usually approx. 10^-27). This seems to have something to do with the number of degrees of freedom (since as the number of genes increases up to 49, the number of residual degrees of freedom drops to 0).

What kind of methods can I use to make sense of this? I have a subsequent set of samples to work on to validate the results of this glm, so I am not sure if overfitting is really a problem.

Background: this is a microarray study, where I have divided the samples in the training set into 2 groups, and generated a number of genes to differentiate between both groups. I am going to use the GLM in a subsequent regression analysis to determine survival. For this purpose, I need to generate some kind of score for each individual case using the coefficients of each gene level * gene expression level.

I am not a statistician (but a clinician) - many apologies if I am not conveying myself very clearly here!

Thanks.

Min-Han Tan
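The behaviour described above can be reproduced on a small scale. With n samples, at most n coefficients (including the intercept) can be estimated, so the model interpolates the training data exactly and the remaining coefficients are reported as NA. This is a sketch on simulated noise with hypothetical sizes (10 samples, 15 "genes") and a gaussian glm for simplicity.

```r
set.seed(1)
n <- 10   # samples
p <- 15   # predictors ("genes"), more than samples

X <- matrix(rnorm(n * p), n, p,
            dimnames = list(NULL, paste0("gene", 1:p)))
d <- data.frame(y = rnorm(n), X)   # response is pure noise

fit <- glm(y ~ ., data = d)

sum(!is.na(coef(fit)))   # n estimable coefficients: intercept + first n - 1 genes
df.residual(fit)         # 0 residual degrees of freedom
deviance(fit)            # essentially zero: a perfect fit to noise
```

Because the response here is random noise and is still fitted perfectly, a near-zero deviance in this setting says nothing about predictive value. That is why a separate validation set, or a method that shrinks or selects coefficients, is usually needed when predictors outnumber samples.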
Re: [R] Loss of rownames and colnames
Thank you all very much. I have solved the problem using lists of matrices. But yes, I will certainly look into the issue of the names for higher-dimensional arrays.

Regards,
Min-han
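Both options mentioned here can keep gene and sample identifiers. The following is a minimal sketch with made-up names (genes, samples, runs, arr) and random numbers in place of expression data. A named list of matrices can be processed in one go with lapply or sapply, and a three-dimensional array accepts a dimnames list with one entry per dimension.

```r
genes   <- paste0("gene", 1:3)
samples <- paste0("s", 1:4)

set.seed(1)
# One matrix per iteration, each keeping its rownames and colnames
runs <- lapply(1:5, function(i)
  matrix(rnorm(12), 3, 4, dimnames = list(genes, samples)))
names(runs) <- paste0("iter", 1:5)

# Work on all iterations at once, with no test.1, test.2, ... objects
gene_means <- sapply(runs, rowMeans)   # genes in rows, iterations in columns
gene_means

# The same data as a 3-d array with names on every dimension
arr <- array(unlist(runs), dim = c(3, 4, 5),
             dimnames = list(genes, samples, names(runs)))
arr["gene2", "s3", "iter4"]
runs$iter4["gene2", "s3"]
```

The list form is convenient when iterations may differ in size. The array form allows apply() across any dimension while still indexing by gene or sample name.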
[R] Loss of rownames and colnames
Hi,

I am working on some microarray data, and have some problems with writing iterations.

In essence, the problem is that objects with three dimensions don't have rownames and colnames. These colnames and rownames would otherwise still be there in 2 dimensional objects.

I need to generate multiple iterations of a 2 means-clustering algorithm, and these objects thus probably need 3 dimensions. My scripts are all written with heavy references to matching of colnames and rownames, so I am running into some problems here. (colnames = sample ids, and rownames = gene ids)

My bad workaround solution so far has been to generate objects tagged with ".2", and have multiple blocks of code, e.g.

test.1 <- ...
test.2 <- ...
test.x <- ..

The obvious problem with this solution is that there does not seem to be an easy way of manipulating all these objects together without typing out their names individually.

Thanks for any advice.

Regards,
Min-Han
[R] Colors in survival plots
Hi,

Sorry to trouble the list - I would like to ask a question - I can't find the answer in the r-help archives.

I am trying to plot 2 survival curves in different colors. What I have tried is this

plot(survfit(sim.surv ~ sim.km.smpl),col="red")

but both the survival curves turn red...

Your advice would be most appreciated!

Thank you.

Min-Han
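A single value of col is applied to every curve. Giving col one color per curve, in the order of the strata, draws them differently. This is a small sketch using the lung data from the survival package in place of sim.surv and sim.km.smpl, which are not available here.

```r
library(survival)

# Two Kaplan-Meier curves, one per level of sex
fit <- survfit(Surv(time, status) ~ sex, data = lung)

cols <- c("red", "blue")   # one color per stratum, in stratum order
plot(fit, col = cols, xlab = "Days", ylab = "Survival")
legend("topright", legend = names(fit$strata), col = cols, lty = 1)
```

Applied to the call in the post, this would read plot(survfit(sim.surv ~ sim.km.smpl), col = c("red", "blue")).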