from:"Min\-Han Tan"

[R] Compiling R as a shared library

2005-02-08 Thread Min-Han Tan

Hi,

I am trying to compile R as a shared library (need to run RMAGEML,
which depends on SJava) on 64-bit SUSE Linux 9.1. I am using the
source code for R 2.0.1 (Nov 15 04).

>  ./configure R_PAPERSIZE=LETTER 
> R_BROWSER=/opt/kde3/share/applications/kde/konqbrowser.desktop 
> --enable-R-shlib

results in several errors (selected errors below), and make check also
yields errors (at bottom). The errors seem to centre around conftest,
with
configure:29275: test -z 
 || test ! -s conftest.err
being fairly typical, occurring multiple times in config.log.

Any advice on troubleshooting this would be greatly appreciated.

Min-Han


___


configure:1908: checking for working aclocal-1.4
configure:1919: result: missing
configure:1923: checking for working autoconf
configure:1930: result: found
configure:1938: checking for working automake-1.4
configure:1949: result: missing

configure:4488: gcc -c -g -O2 -I/usr/local/include conftest.c >&5
conftest.c:2: error: parse error before "me"
configure:4494: $? = 1
configure: failed program was:
| #ifndef __cplusplus
|   choke me
| #endif

configure:4707: gcc -E -I/usr/local/include conftest.c
conftest.c:16:28: ac_nonexistent.h: No such file or directory
configure:4713: $? = 1
configure: failed program was:
| /* confdefs.h.  */
| 
| #define PACKAGE_NAME "R"
| #define PACKAGE_TARNAME "R"
| #define PACKAGE_VERSION "2.0.1"
| #define PACKAGE_STRING "R 2.0.1"
| #define PACKAGE_BUGREPORT "[EMAIL PROTECTED]"
| #define PACKAGE "R"
| #define VERSION "2.0.1"
| #define R_PLATFORM "x86_64-unknown-linux-gnu"
| #define R_CPU "x86_64"
| #define R_VENDOR "unknown"
| #define R_OS "linux-gnu"
| #define Unix 1
| /* end confdefs.h.  */
| #include <ac_nonexistent.h>


configure:4814: gcc -E -I/usr/local/include conftest.c
conftest.c:16:28: ac_nonexistent.h: No such file or directory
configure:4820: $? = 1
configure: failed program was:
| /* confdefs.h.  */
| 
| #define PACKAGE_NAME "R"
| #define PACKAGE_TARNAME "R"
| #define PACKAGE_VERSION "2.0.1"
| #define PACKAGE_STRING "R 2.0.1"
| #define PACKAGE_BUGREPORT "[EMAIL PROTECTED]"
| #define PACKAGE "R"
| #define VERSION "2.0.1"
| #define R_PLATFORM "x86_64-unknown-linux-gnu"
| #define R_CPU "x86_64"
| #define R_VENDOR "unknown"
| #define R_OS "linux-gnu"
| #define Unix 1
| /* end confdefs.h.  */
| #include <ac_nonexistent.h>


configure:5099: gcc -E -I/usr/local/include conftest.c
conftest.c:16:28: ac_nonexistent.h: No such file or directory
configure:5105: $? = 1
configure: failed program was:
| /* confdefs.h.  */
| 
| #define PACKAGE_NAME "R"
| #define PACKAGE_TARNAME "R"
| #define PACKAGE_VERSION "2.0.1"
| #define PACKAGE_STRING "R 2.0.1"
| #define PACKAGE_BUGREPORT "[EMAIL PROTECTED]"
| #define PACKAGE "R"
| #define VERSION "2.0.1"
| #define R_PLATFORM "x86_64-unknown-linux-gnu"
| #define R_CPU "x86_64"
| #define R_VENDOR "unknown"
| #define R_OS "linux-gnu"
| #define Unix 1
| /* end confdefs.h.  */
| #include <ac_nonexistent.h>

configure:29307: checking whether isfinite is declared
configure:29340: gcc -c -g -O2 -I/usr/local/include conftest.c >&5
conftest.c: In function `main':
conftest.c:122: error: `isfinite' undeclared (first use in this function)
conftest.c:122: error: (Each undeclared identifier is reported only once
conftest.c:122: error: for each function it appears in.)
configure:29346: $? = 1
configure: failed program was:
| /* confdefs.h.  */
| 
| #define PACKAGE_NAME "R"
| #define PACKAGE_TARNAME "R"
| #define PACKAGE_VERSION "2.0.1"
| #define PACKAGE_STRING "R 2.0.1"
| #define PACKAGE_BUGREPORT "[EMAIL PROTECTED]"
| #define PACKAGE "R"
| #define VERSION "2.0.1"
| #define R_PLATFORM "x86_64-unknown-linux-gnu"
| #define R_CPU "x86_64"
| #define R_VENDOR "unknown"
| #define R_OS "linux-gnu"
| #define Unix 1
| #ifdef __cplusplus
| extern "C" void std::exit (int) throw (); using std::exit;
| #endif
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_DLFCN_H 1
| #define HAVE_LIBM 1
| #define HAVE_LIBDL 1
| #define STDC_HEADERS 1
| #define TIME_WITH_SYS_TIME 1
| #define HAVE_DIRENT_H 1
| #define HAVE_SYS_WAIT_H 1
| #define HAVE_ARPA_INET_H 1
| #define HAVE_DLFCN_H 1
| #define HAVE_ELF_H 1
| #define HAVE_FCNTL_H 1
| #define HAVE_FPU_CONTROL_H 1
| #define HAVE_GRP_H 1
| #define HAVE_IEEE754_H 1
| #define HAVE_LIMITS_H 1
| #define HAVE_LOCALE_H 1
| #define HAVE_NETDB_H 1
| #define HAVE_NETINET_IN_H 1
| #define HAVE_PWD_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_SYS_PARAM_H 1
| #define HAVE_SYS_SELECT_H 1
| #define HAVE_SYS_SOCKET_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_SYS_TIME_H 1
| #define HAVE_SYS_TIMES_H 1
| #define HAVE_SYS_UTSNAME_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_WCHAR_H 1
| #define HAVE_ERRNO_H 1
| #define HAVE_STDARG_H 1
| #define HAVE_STRING_H 1
| #define HAVE_POSIX_SETJMP 1
| #define HAVE_GLIBC2 1

Re: [R] Postgraduate distance learning courses for MSc in Statistics

2005-01-31 Thread Min-Han Tan

There are several institutions here which offer distance learning.

http://programs.gradschools.com/distance/statistics.html

Min-Han Tan

On Mon, 31 Jan 2005 14:13:38 +0100, CG Pettersson
<[EMAIL PROTECTED]> wrote:
> 30 Januari, Tilo Blenk  wrote:
> 
>to
>  ... snip ...
>
>  
> Dear Tilo
> As it was surprisingly silent on this question, I can offer a tip for
> you,
> nearer than both England or USA:
> 
> At KVL in Copenhagen, they give several courses with well developed
> distance methodology. They use both SAS and R, not totally parallel
> though..
> 
> Check out the following link:
> 
> http://statmaster.sdu.dk/
> 
> Cheers
> /CG
> 
> CG Pettersson, MSci, PhD Stud.
> Swedish University of Agricultural Sciences
> Dep. of Ecology and Crop Production. Box 7043
> SE-750 07 Uppsala
> Sweden
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] Successful installation of R 2.0.1 on SUSE 9.1

2005-01-20 Thread Min-Han Tan

Hi,

We managed to compile R 2.0.1 on 64-bit SUSE Linux 9.1 on an HP
ProLiant setup fairly uneventfully by following the instructions in the R
installation guide. We did encounter a minor hiccup in setting up X11,
a problem which we note has been raised 4 or 5 times previously, but
this was overcome thanks to this recent post by Peter Dalgaard on SUSE
9.1 and R, https://stat.ethz.ch/pipermail/r-help/2005-January/062397.html,
as well as previous comments on the mailing list.

One clarification on that post may be helpful: only 3 additional
development packages are required for a successful X11
installation.

XFree86-devel-4.3.99.902-30
fontconfig-devel
freetype2-devel

These were not available in YAST (9.1 SUSE), but were located in

http://ftp.suse.com/pub/suse/ 

Once again, thanks to the assorted R gurus and wizards for making this
mailing list such a great resource.

Regards,
Min-Han Tan


[R] hclust and heatmap - slightly different dendrograms?

2004-12-15 Thread Min-Han Tan

Good afternoon,

I ran heatmap and hclust on the same matrix x (strictly, I ran
heatmap(x) and hclust(dist(t(x)))), and realized that the two
dendrograms were slightly different, in that the left-right
arrangement of one pair of subclusters (columns) was reversed between
the two functions (but all individual columns were grouped correctly).

Looking through the code for heatmap as a most definite nonexpert, it
seems to me that hclust is also invoked by heatmap.

> heatmap
function (x, Rowv = NULL, Colv = if (symm) "Rowv" else NULL, 
distfun = dist, hclustfun = hclust, add.expr, symm = FALSE,
...

hcr <- hclustfun(distfun(x))
ddr <- as.dendrogram(hcr)


hcc <- hclustfun(distfun(if (symm) x else t(x)))
ddc <- as.dendrogram(hcc)


I understand it is possible to add Rowv=NA and order the samples as
per hclust, but I'm just wondering if there is a reason for this
observation. Any pointers would be very much appreciated.
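
One possible explanation, offered as a sketch rather than a confirmed diagnosis: in the versions of R I am familiar with, heatmap() passes each dendrogram through reorder(), using the row or column means as weights. That can flip the left-right order of two branches without changing which columns are grouped together. The matrix x below is made-up example data, and the reorder() call is my assumption about what heatmap() does by default.

```r
set.seed(1)
# Hypothetical example data: 20 genes (rows) by 6 samples (columns).
x <- matrix(rnorm(120), nrow = 20,
            dimnames = list(paste0("g", 1:20), paste0("s", 1:6)))

hcc <- hclust(dist(t(x)))          # clustering of the columns
ddc <- as.dendrogram(hcc)

# Assumption: mirrors heatmap()'s default reordering by column means.
ddc.reordered <- reorder(ddc, colMeans(x))

order.hclust  <- hcc$labels[hcc$order]   # leaf order drawn by plot(hcc)
order.heatmap <- labels(ddc.reordered)   # leaf order after reordering

# The tree itself is unchanged: the cophenetic distances agree once
# both matrices are put in the same column order.
cols <- colnames(x)
same.tree <- isTRUE(all.equal(
  as.matrix(cophenetic(ddc))[cols, cols],
  as.matrix(cophenetic(ddc.reordered))[cols, cols]))
```

Comparing order.hclust with order.heatmap shows whether any branches were swapped for a given data set; same.tree checks that only the drawing order, not the grouping, differs.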

Thanks!

Min-Han Tan


[R] Re: Estimating survival?

2004-11-03 Thread Min-Han Tan

Apologies, I located the function in cph (... time.inc =30, surv = TRUE)
...
$surv.summary

Thanks.
Min-Han

On Tue, 2 Nov 2004 21:06:33 -0500, Min-Han Tan <[EMAIL PROTECTED]> wrote:
> Hi,
> 
> Sorry to trouble the list. I have a problem which I'm not sure how to resolve.
> 
> I have a Cox model with 1 independent variable with 2 categories (and
> thus 2 survival curves on plotting survfit)
> 
> How can I get an estimate of survival for each category at a
> particular time point, with standard error?
> 
> Looking through ?cph and ?coxph, I'm not quite sure how to go about
> that. I would really appreciate any direction here.
> 
> Thanks a million!
> 
> Min-Han
>


[R] Estimating survival?

2004-11-02 Thread Min-Han Tan

Hi,

Sorry to trouble the list. I have a problem which I'm not sure how to resolve.

I have a Cox model with 1 independent variable with 2 categories (and
thus 2 survival curves on plotting survfit)

How can I get an estimate of survival for each category at a
particular time point, with standard error?
 
Looking through ?cph and ?coxph, I'm not quite sure how to go about
that. I would really appreciate any direction here.
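
For reference, a minimal sketch of one way to get such estimates with the survival package alone (not the cph route from the Design package). The lung data set, the recoded sex variable and the 365-day time point are stand-ins chosen only for illustration.

```r
library(survival)

# Hypothetical example data: 'lung' ships with the survival package.
dat <- lung
dat$sex <- factor(dat$sex, levels = c(1, 2), labels = c("male", "female"))

# Cox model with one two-category covariate.
fit <- coxph(Surv(time, status) ~ sex, data = dat)

# One predicted survival curve per category.
new <- data.frame(sex = factor(c("male", "female"), levels = levels(dat$sex)))
curves <- survfit(fit, newdata = new)

# Estimates at a chosen time point (365 days is an arbitrary choice).
est <- summary(curves, times = 365)
est$surv      # survival estimate for each category
est$std.err   # standard error of each estimate
```

The columns of est$surv and est$std.err follow the row order of the newdata frame, so here the first entry is for "male" and the second for "female".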

Thanks a million!

Min-Han


[R] Fitting additional variables to model

2004-10-26 Thread Min-Han Tan

Good morning,

Sorry to trouble the list. I'm working on Cox models of survival, and
am encountering a problem. I'm trying to group variables into some
kind of new staging system. By grouping, I mean that so-called
'integrated staging systems' for cancer merge categories of variables
such as tumor stage and patient status into a single range of categories,
e.g.
System I = Stage I or II, Patient Status 0
System II = Stage I or II, Patient Status 1
 OR
 Stage III, Patient Status 0
So in this example, Stage I + II are grouped together, probably based
on outcome.


So in the scenario where each of the initial 2 variables A and B
involved in the model have 4 categories:

1. Is there any other way to obtain a grouping of the variables by
outcome besides examining all 16 possible Kaplan-Meier curves
concurrently, and seeing how they group? Would it make sense to run
pairwise survfits - but if so, what happens when more variables are
introduced into the equation? Finally, is it possible to execute this
in R? Thanks!!!

2. If there is a significant interaction between these 2 terms (A*B),
does it even make sense to ask how I can perform "grouping" of the
variables?

Thanks in advance!

Min-Han


Re: [R] Validating a Cox model on an external set

2004-09-28 Thread Min-Han Tan

Thank you all for your very insightful comments! And thank you for the
directions to the packages!

Re: non-statistical issues, yes, I was looking through Altman and
Royston's Stat Med 2000 article "What do we mean by validating a
prognostic model?" last night and it was very interesting. I'm working
on expression profiling in tumor samples, and there are several
difficulties in designing an experiment along those 'ideal'
guidelines.

A. Small sample size is of course the most common recurring problem,
with concomitant even lower event rate.

B. Patient recruitment is another issue, as many of the samples
degrade over time - the older the sample, the more degraded it
usually is! That biases selection towards recent cases.
Different centres have different storage techniques, resulting in
extensive degradation in samples from 1 centre and relatively intact
samples from others. So there is no choice but to perform "data-driven
selection" of cases - i.e. only samples which have good RNA.

Other problems I've encountered include:

C. Computational time. Using a training sample size of 50 arrays,
running a full internal cross-validation of a model derived using
pamr.cv.cox took my computer about one and a half hours (with no other
process running; P4, 3 GHz, 2 GB RAM, R 1.9.1, Windows XP). And
that's just *one* randomization!

Min-Han

On Tue, 28 Sep 2004 10:55:50 -0700, Berton Gunter
<[EMAIL PROTECTED]> wrote:
> 
> But note that there may be deeper, non-statistical, issues of what you mean
> by "validation" here: how good must the predictions be on the validation
> data? How similar or dissimilar should the validation data be to the
> "training" data? To what end/population is the fitted model to be applied?
> For example, AFAIK in most scientific research, a model is not considered
> "validated" unless results can be substantively reproduced (??) in different
> labs, sometimes with alternative methods.
> 
> Think of the 1916 (I think it was) measurements of star positions during a
> total solar eclipse to "validate" Einstein's Theory of General Relativity.
> My point is not to say that this kind of "validation" is appropriate for a
> Cox model, but only that the issues are worth thinking about.
> 
> -- Bert Gunter
> Genentech Non-Clinical Statistics
> South San Francisco, CA
> 
> "The business of the statistician is to catalyze the scientific learning
> process."  - George E. P. Box
> 
> 
> 
> 
> > -Original Message-
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of Frank
> > E Harrell Jr
> > Sent: Tuesday, September 28, 2004 10:11 AM
> > To: Min-Han Tan
> > Cc: [EMAIL PROTECTED]
> > Subject: Re: [R] Validating a Cox model on an external set
> >
> > Min-Han Tan wrote:
> > > Good morning,
> > >
> > > Sorry to trouble the list.
> > >
> > > I have a problem I hope to seek your advice on.
> > >
> > > Essentially, I am trying to 'validate' a multivariate Cox
> > proportional
> > > hazards model built in a training set, by testing it on an external
> > > test set. I have performed a survfit using the Cox model to predict
> > > survival for the test set, and obtained individual predictions for
> > > survival time, with standard error for each test sample.
> > Each of these
> > > cases has an actual survival time, some censored.
> > >
> > > How can we decide whether the Cox model has been validated or not?
> >
> > This is what the Design package and its cph and validate.cph and
> > calibrate.cph functions are for.
> >
> > >
> > > I was suggested survdiff in the survival package, but survdiff works
> > > between curves; am not sure how I could use it (I have a predicted
> > > curve for each curve, but no 'observed curve' - the only observation
> > > is death or censoring at time x)
> > >
> > > Thank you all so much!
> > >
> > > Min-Han Tan
> > > Van Andel Institute
> > >
> > > __
> > > [EMAIL PROTECTED] mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
> > >
> >
> >
> > --
> > Frank E Harrell Jr   Professor and Chair   School of Medicine
> >   Department of Biostatistics
> > Vanderbilt University
> > 
> > __
> 
> 
> > [EMAIL PROTECTED] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
> >
> 
> 
> 
>


[R] Validating a Cox model on an external set

2004-09-28 Thread Min-Han Tan

Good morning,

Sorry to trouble the list. 

I have a problem I hope to seek your advice on. 
 
Essentially, I am trying to 'validate' a multivariate Cox proportional
hazards model built in a training set, by testing it on an external
test set. I have performed a survfit using the Cox model to predict
survival for the test set, and obtained individual predictions for
survival time, with standard error for each test sample. Each of these
cases has an actual survival time, some censored.
 
How can we decide whether the Cox model has been validated or not?
 
survdiff in the survival package was suggested to me, but survdiff works
between curves; I am not sure how I could use it (I have a predicted
curve for each case, but no 'observed curve' - the only observation
is death or censoring at time x).
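
One commonly used check, sketched here under stated assumptions and not taken from the replies in this thread: compute the prognostic index (linear predictor) for each test case from the training-set coefficients, then fit a Cox model with that index as the only covariate in the test set. The data set, the odd/even split and the covariates below are hypothetical stand-ins.

```r
library(survival)

# Hypothetical split of the 'lung' data into training and external test sets.
dat   <- na.omit(lung[, c("time", "status", "age", "ph.ecog")])
train <- dat[seq(1, nrow(dat), by = 2), ]
test  <- dat[seq(2, nrow(dat), by = 2), ]

# Model built on the training set only.
fit <- coxph(Surv(time, status) ~ age + ph.ecog, data = train)

# Prognostic index for the test cases, using the training coefficients.
test$lp <- predict(fit, newdata = test, type = "lp")

# Calibration slope: a coefficient near 1 suggests the training-set
# effects carry over to the test set; near 0 suggests they do not.
slope.fit <- coxph(Surv(time, status) ~ lp, data = test)
coef(slope.fit)
```

This uses the observed death or censoring times directly, so no 'observed curve' per case is needed.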

Thank you all so much! 
 
Min-Han Tan
Van Andel Institute


[R] Cox proportional hazards model

2004-09-22 Thread Min-Han Tan

Good afternoon,

I am currently trying to do some work on survival analysis.

- I hope to seek your advice re: 2 questions (1 general and 1 specific)

(1) I'm trying to do a stratified Cox analysis and subsequently
plot(survfit(object)). It seems to work for some strata, but not for
others.

I have tumor grade, which is a range of 1 - 4. 

When I divide this range of 1:4 into 2 groups, it works fine for
strata(grade>2) and strata(grade>3). However, if I do a
strata(grade>1), there is an error when I do a survfit():

Call: survfit.coxph(object = s)

Error in print.survfit(structure(list(n = as.integer(46), time = c(22,  : 
length of dimnames [1] not equal to array extent

(2) As a general question, is it possible to distinguish between
confounding and interaction in the Cox proportional hazards model?

Thanks!

Min-Han


[R] Re: brlr function

2004-08-25 Thread Min-Han Tan

Thank you all. 

The reason was that the dependent variable should have been coded 0 and 1,
rather than 1 and 2!
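
A minimal sketch of the recoding, using made-up values and glm() with a binomial family as a stand-in for brlr (which I have not re-run here):

```r
# Hypothetical two-class response coded 1/2, as in the original post.
yhat <- c(1, 1, 2, 1, 2, 2, 1, 2)
# Hypothetical expression values for a single gene.
expr <- c(5.8, 6.7, 7.4, 5.6, 7.7, 6.3, 6.5, 6.1)

# Shift the coding to 0/1 so that 0 <= y <= 1 holds.
y01 <- as.integer(yhat) - 1L

fit <- glm(y01 ~ expr, family = binomial)
```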

Apologies for any trouble.

Min-Han


On Wed, 25 Aug 2004 18:40:38 -0400, Min-Han Tan
<[EMAIL PROTECTED]> wrote:
> Hi,
> 
> I'm trying the brlr function in a penalized logistic regression function.
> 
> However, I am not sure why I am encountering errors. I hope to seek
> your advice here. (output below)
> 
> Thank you! Your help is truly appreciated.
> 
> Min-Han
> 
> #No error here, the glm seems to work fine
> > genes.cox1.glm1<-glm(as.formula(paste(paste('as.integer(sim.cv.cox.yhat1)~'),paste('sim.dat.tst[',genes.cox1.rows,',]',sep="",collapse='+'
> 
> #Something happened here ... I only substituted brlr for glm
> > genes.cox1.glm1<-brlr(as.formula(paste(paste('as.integer(sim.cv.cox.yhat1)~'),paste('sim.dat.tst[',genes.cox1.rows,',]',sep="",collapse='+'
> Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1
> 
> #using the test data in brlr gives the same error if the dependent is
> reduced to one column, is it a problem with my dependent? It works
> fine as per the original vignette with
> brlr(cbind(grahami,opalinus)~height ...)
> > brlr(grahami~height+diameter,data=lizards)
> Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1
> 
> #This is how the objects look
> > as.integer(sim.cv.cox.yhat1)
>  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 2 2 2 1 1 2 2 1 1 1 2 1 2 2 2
> [34] 1 2 1 1 2 1 1 2 1 1 1 2 2 2 2
> 
> > as.formula(paste(paste('as.integer(sim.cv.cox.yhat1)~'),paste('sim.dat.tst[',genes.cox1.rows,',]',sep="",collapse='+')))
> as.integer(sim.cv.cox.yhat1) ~ sim.dat.tst[27, ] + sim.dat.tst[35,
>] + sim.dat.tst[17, ] + sim.dat.tst[41, ] + sim.dat.tst[38,
>] + sim.dat.tst[31, ] + sim.dat.tst[3, ] + sim.dat.tst[32,
>] + sim.dat.tst[16, ] + sim.dat.tst[48, ] + sim.dat.tst[13,
>] + sim.dat.tst[28, ]
> 
> > sim.dat.tst[27,]
>  Sample 1  Sample 2  Sample 3  Sample 4  Sample 5  Sample 6  Sample 7  Sample 8
>      5.79      6.72      6.11      5.58      6.32      6.50      6.45      5.77
>  Sample 9 Sample 10 Sample 11 Sample 12 Sample 13 Sample 14 Sample 15 Sample 16
>      7.46      6.32      5.64      5.77      5.44      5.83      5.57      5.70
> Sample 17 Sample 18 Sample 19 Sample 20 Sample 21 Sample 22 Sample 23 Sample 24
>      5.67      5.72      5.50      6.34      5.98      6.10      6.25      6.13
> Sample 25 Sample 26 Sample 27 Sample 28 Sample 29 Sample 30 Sample 31 Sample 32
>      7.40      7.28      8.37      5.73      6.83      5.72      6.12      6.74
> Sample 33 Sample 34 Sample 35 Sample 36 Sample 37 Sample 38 Sample 39 Sample 40
>      6.64      6.13      7.74      6.70      7.37      6.54      6.49      5.75
> Sample 41 Sample 42 Sample 43 Sample 44 Sample 45 Sample 46 Sample 47 Sample 48
>      6.18      6.41      7.68      5.36      6.34      5.86      6.64      6.69
>


[R] brlr function

2004-08-25 Thread Min-Han Tan

Hi,

I'm trying the brlr function in a penalized logistic regression function. 

However, I am not sure why I am encountering errors. I hope to seek
your advice here. (output below)

Thank you! Your help is truly appreciated.

Min-Han


#No error here, the glm seems to work fine
> genes.cox1.glm1<-glm(as.formula(paste(paste('as.integer(sim.cv.cox.yhat1)~'),paste('sim.dat.tst[',genes.cox1.rows,',]',sep="",collapse='+'

#Something happened here ... I only substituted brlr for glm
 > genes.cox1.glm1<-brlr(as.formula(paste(paste('as.integer(sim.cv.cox.yhat1)~'),paste('sim.dat.tst[',genes.cox1.rows,',]',sep="",collapse='+'
Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1

#using the test data in brlr gives the same error if the dependent is
reduced to one column, is it a problem with my dependent? It works
fine as per the original vignette with
brlr(cbind(grahami,opalinus)~height ...)
> brlr(grahami~height+diameter,data=lizards)
Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1
 
#This is how the objects look
> as.integer(sim.cv.cox.yhat1)
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 2 2 2 1 1 2 2 1 1 1 2 1 2 2 2
[34] 1 2 1 1 2 1 1 2 1 1 1 2 2 2 2

> as.formula(paste(paste('as.integer(sim.cv.cox.yhat1)~'),paste('sim.dat.tst[',genes.cox1.rows,',]',sep="",collapse='+')))
as.integer(sim.cv.cox.yhat1) ~ sim.dat.tst[27, ] + sim.dat.tst[35, 
] + sim.dat.tst[17, ] + sim.dat.tst[41, ] + sim.dat.tst[38, 
] + sim.dat.tst[31, ] + sim.dat.tst[3, ] + sim.dat.tst[32, 
] + sim.dat.tst[16, ] + sim.dat.tst[48, ] + sim.dat.tst[13, 
] + sim.dat.tst[28, ]
 
> sim.dat.tst[27,]
 Sample 1  Sample 2  Sample 3  Sample 4  Sample 5  Sample 6  Sample 7  Sample 8
     5.79      6.72      6.11      5.58      6.32      6.50      6.45      5.77
 Sample 9 Sample 10 Sample 11 Sample 12 Sample 13 Sample 14 Sample 15 Sample 16
     7.46      6.32      5.64      5.77      5.44      5.83      5.57      5.70
Sample 17 Sample 18 Sample 19 Sample 20 Sample 21 Sample 22 Sample 23 Sample 24
     5.67      5.72      5.50      6.34      5.98      6.10      6.25      6.13
Sample 25 Sample 26 Sample 27 Sample 28 Sample 29 Sample 30 Sample 31 Sample 32
     7.40      7.28      8.37      5.73      6.83      5.72      6.12      6.74
Sample 33 Sample 34 Sample 35 Sample 36 Sample 37 Sample 38 Sample 39 Sample 40
     6.64      6.13      7.74      6.70      7.37      6.54      6.49      5.75
Sample 41 Sample 42 Sample 43 Sample 44 Sample 45 Sample 46 Sample 47 Sample 48
     6.18      6.41      7.68      5.36      6.34      5.86      6.64      6.69


[R] A troubled state of freedom: generalized linear models where number of parameters > number of samples

2004-08-21 Thread Min-Han Tan

Good morning,

Thank you all for your help so far. I really appreciate it.

The crux of my problem is that I am generating a generalized linear
model with 1 dependent variable, approximately 50 training samples and
100 parameters (gene levels).

Essentially, if I have 100 genes and 50 samples, this results in
coefficients for the first 49 genes, and NAs for the rest, with an
ultra low residual deviance (usually approx. 10^-27). This seems to
have something to do with the number of degrees of freedom (since as
the number of genes increases up to 49, the number of residual degrees
of freedom drops to 0).
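
The behaviour can be reproduced with simulated data. The sketch below uses random numbers in place of real expression values and a Gaussian glm() for simplicity; all object names are made up for illustration.

```r
set.seed(1)
n <- 50    # samples
p <- 100   # genes

# Hypothetical data: random "expression" values and a random response.
genes <- matrix(rnorm(n * p), nrow = n,
                dimnames = list(NULL, paste0("gene", 1:p)))
y <- rnorm(n)

fit <- glm(y ~ genes)   # Gaussian family by default

n.estimated <- sum(!is.na(coef(fit)))  # intercept + first 49 genes
n.aliased   <- sum(is.na(coef(fit)))   # the remaining 51 genes
df.left     <- df.residual(fit)        # no residual degrees of freedom
dev.left    <- deviance(fit)           # essentially zero
```

With 50 samples, at most 50 coefficients (the intercept plus 49 genes) can be estimated; the model then reproduces the training responses exactly, which is why the residual deviance is numerically zero regardless of whether the genes carry any signal.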

What kind of methods can I use to make sense of this? 

I have a subsequent set of samples to work on to validate the results
of this glm, so I am not sure if overfitting is really a problem.

Background: this is a microarray study, where I have divided the
samples in the training set into 2 groups, and generated a number of
genes to differentiate between both groups. I am going to use the GLM
in a subsequent regression analysis to determine survival. For this
purpose, I need to generate some kind of score for each individual
case using the coefficients of each gene level * gene expression
level.

I am not a statistician (but a clinician) - many apologies if I am not
conveying myself very clearly here!

Thanks. 

Min-Han Tan


Re: [R] Loss of rownames and colnames

2004-08-20 Thread Min-Han Tan

Thank you all very much.

I have solved the problem using lists of matrices.

But yes, I will certainly look into the issue of the names for
higher-dimensional arrays.

Regards,
Min-han


[R] Loss of rownames and colnames

2004-08-20 Thread Min-Han Tan

Hi,

I am working on some microarray data, and have some problems with
writing iterations.

In essence, the problem is that objects with three dimensions don't
have rownames and colnames. These colnames and rownames would
otherwise still be there in 2 dimensional objects.

I need to generate multiple iterations of a 2 means-clustering
algorithm, and these objects thus probably need 3 dimensions.

My scripts are all written with heavy references to matching of
colnames and rownames, so I am running into some problems here.
(colnames = sample ids, and rownames = gene ids)

My bad workaround solution so far has been to generate objects tagged
with ".2", and have multiple blocks of code.

e.g. 

test.1 <- ...

test.2 <- ...

test.x <- ..

The obvious problem with this solution is that there does not seem to
be an easy way of manipulating all these objects together without
typing out their names individually.
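
Two options that avoid separately named objects, sketched with made-up gene and sample ids: a 3-d array with dimnames, or a list of matrices that can be processed with lapply/sapply.

```r
genes   <- paste0("gene", 1:4)   # hypothetical gene ids (rows)
samples <- paste0("s", 1:3)      # hypothetical sample ids (columns)
iters   <- c("iter1", "iter2")   # hypothetical iteration labels

# Option 1: a 3-d array carries names for every dimension via dimnames.
res <- array(0, dim = c(4, 3, 2),
             dimnames = list(genes, samples, iters))
res["gene2", "s3", "iter1"] <- 1
slice <- res[, , "iter1"]        # 4 x 3 matrix, row/col names retained

# Option 2: a list of matrices, one element per iteration.
runs <- lapply(1:2, function(i)
  matrix(i, nrow = 4, ncol = 3, dimnames = list(genes, samples)))
names(runs) <- iters

# Apply the same function to every iteration without naming each object.
col.means <- sapply(runs, colMeans)
```

rownames() and colnames() work on each 2-d slice or list element; for the array as a whole, dimnames(res) holds the names of all three dimensions.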

Thanks for any advice.

Regards,
Min-Han


[R] Colors in survival plots

2004-07-25 Thread Min-Han Tan

Hi,

Sorry to trouble the list - I would like to ask a question - I can't
find the answer in the r-help archives.

I am trying to plot 2 survival curves in different colors.

What I have tried is this 
plot(survfit(sim.surv ~ sim.km.smpl),col="red")

but both the survival curves turn red... 
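
A minimal sketch of one way to do this, assuming the col argument accepts one colour per curve; the lung data set stands in for sim.surv and sim.km.smpl.

```r
library(survival)

# Hypothetical example data from the survival package: two groups by sex.
fit <- survfit(Surv(time, status) ~ sex, data = lung)

# One colour per curve, in the order of the strata.
plot(fit, col = c("red", "blue"))
legend("topright", legend = names(fit$strata),
       col = c("red", "blue"), lty = 1)
```

A single colour such as col="red" is applied to every curve, which matches the behaviour described above.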

Your advice would be most appreciated! Thank you.

Min-Han

