[R] multiple testing in survival

2008-03-14 Thread darteta001
Dear list, I would like to perform a Cox PH regression analysis on my "time to reach a severity score of 5". I have 600 observations and 50 predictor variables. So I first want to perform a log-rank test (using the survdiff() function) on each individual variable in order to select a subset o

[R] Design's validate() output

2008-03-11 Thread darteta001
Dear list, is there anywhere I could find further information on how to interpret the validate() output for a logistic regression from the Design package? I tried ?validate and Google, but I cannot find information on what the rows and the columns represent. Thanks, David

Re: [R] Cox model

2008-02-12 Thread darteta001
Dear Eleni, here is a quote from a previous post, sent last Tuesday, regarding the maximum number of variables in a multiple linear regression analysis; I think it is also relevant to Cox PH models: "I can think of no circumstance where multiple regression on "hundreds of thousands of variables" is anythi

Re: [R] Cox model

2008-02-12 Thread darteta001
Hi Eleni, I am not an expert in R or statistics, but in my opinion you have too many regressors compared to the number of observations, and that might be the reason why you get the error. Others may know better, but as far as I know, having only 80 observations, it is a good idea to first filter

[R] correlation

2008-02-08 Thread darteta001
Dear list, I would like to compare two measurements of disease severity (M1 and M2); one of them is continuous (M1, ranging from 1 to 10) and the other is ordinal (M2 takes Low, Medium, High and Very High). Do you think it is OK to use the cor() function to test whether the two agree, i.e. correlate? I a

[R] summary of categorical variables

2008-01-21 Thread darteta001
Dear list, I have a data.frame with nine categorical variables (0, 1, 2 and NAs), and I would like to get the number of events for each of them. I can extract this using summary() on one variable at a time, wrapped in as.factor() (otherwise it gives me the mean value): > summary(as.fact
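
One possible way to get the counts for all columns in a single call, sketched under assumptions: the data.frame `df` and its columns `v1` and `v2` below are made-up stand-ins for the nine variables in the post, and table() with useNA = "ifany" is used in place of summary(as.factor()).

```r
# Hypothetical example data: two categorical columns coded 0/1/2 with NAs
df <- data.frame(
  v1 = c(0, 1, 2, NA),
  v2 = c(1, 1, NA, NA)
)

# One frequency table per column; useNA = "ifany" adds an NA count when NAs exist
counts <- lapply(df, table, useNA = "ifany")

print(counts)
```

The result is a list with one table per variable, so counts$v2 gives the counts for that column alone.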

[R] histogram with NAs

2008-01-18 Thread darteta001
Dear list, I have a categorical variable in a data.frame that I would like to plot as a histogram to show the number of events. Values are 0, 1 and some NAs. I can't make the hist() function 1) include a column with the number of NAs or 2) make the x axis categorical; I always get 0, 0.2,
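
A minimal sketch of one alternative, assuming a bar chart of counts is acceptable in place of hist(): tabulate the values with table(), keep the NAs as their own category, and pass the counts to barplot(). The vector `x` is invented example data, not the poster's.

```r
# Hypothetical example data: a 0/1 variable with some missing values
x <- c(0, 1, 1, NA, 0, 1, NA)

# Count each category; useNA = "ifany" keeps NA as a separate category
counts <- table(x, useNA = "ifany")

# Give the NA category a printable label for the axis
names(counts)[is.na(names(counts))] <- "NA"

# One bar per category, so the x axis is categorical rather than continuous
barplot(counts, xlab = "Value", ylab = "Number of events")
```

hist() bins a numeric range, which is why it produces a continuous axis; barplot() draws one bar per table entry.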

[R] FDR for hypergeometric tests

2008-01-15 Thread darteta001
Dear list, I have performed several tests for the hypergeometric distribution using phyper() for some gene annotation categories as follows >phyper(26,830,31042,337, lower.tail=F) >phyper(16,387,31042,337, lower.tail=F) . . . I am only running some selected categories but I would like to cor
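
A minimal sketch, assuming the goal is a Benjamini-Hochberg FDR adjustment of the phyper() p-values: collect them into one vector and pass it to p.adjust(). Only the two calls quoted in the post are used here; the categories elided in the post would be appended to the vector in the same way.

```r
# p-values from the two calls quoted in the post.
# With lower.tail = FALSE, phyper(q, ...) returns P(X > q).
p <- c(
  phyper(26, 830, 31042, 337, lower.tail = FALSE),
  phyper(16, 387, 31042, 337, lower.tail = FALSE)
)

# Benjamini-Hochberg adjustment; method = "BH" controls the false discovery rate
p_bh <- p.adjust(p, method = "BH")

print(data.frame(raw = p, BH = p_bh))
```

The adjustment depends on how many tests go into the vector, so the adjusted values for two tests will differ from those for the full set of categories.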

Re: [R] Reproducibility of experiment

2007-12-17 Thread darteta001
Dear Marc and R-list, thanks for your help. I have checked the Bland-Altman help page about repeatability, and I learnt that I was actually talking about repeatability rather than reproducibility. However, I am not sure whether they only focus on agreement of two different measurement methods, and not

[R] Reproducibility of experiment

2007-12-12 Thread darteta001
Dear list, I have an experiment that I have run 10 times in order to find out its reproducibility. I wonder if there is any function that I can use for obtaining a significance value of reproducibility or agreement of measurements. I thought of coefficient of variation but, as far as I know, I

[R] reference for logistic regression

2007-10-11 Thread darteta001
Dear list, first, please accept my apologies for asking a non-R question. Can anyone point me to a good reference on logistic regression? Web or book references would be great. I am interested in the use and interpretation of dummy variables and prediction models. I checked the contributed section in th

[R] trimmed mean

2007-10-05 Thread darteta001
Dear list, I would like to calculate the trimmed mean for some subsets within a data.frame (data). The list has already been very helpful, and I have managed to calculate the standard mean for the "Intensity" column based on the "Name" column using > avgs <- aggregate(data$Intensity, by = list(
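
A possible sketch, assuming the grouping is by the "Name" column as described: aggregate() forwards extra arguments to FUN, so trim can be passed straight through to mean(). The data.frame below is invented for illustration, and the trim fraction of 0.2 is an arbitrary choice.

```r
# Hypothetical example data with the column names mentioned in the post
data <- data.frame(
  Name      = rep(c("a", "b"), each = 5),
  Intensity = c(1, 2, 3, 4, 100, 10, 20, 30, 40, 50)
)

# trim = 0.2 drops 20% of the observations from each end before averaging;
# aggregate() passes the extra argument on to mean()
tavgs <- aggregate(data$Intensity, by = list(Name = data$Name),
                   FUN = mean, trim = 0.2)

print(tavgs)
```

With five values per group and trim = 0.2, one value is dropped from each end, so the outlier 100 in group "a" no longer affects the result.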

Re: [R] mean of subset of rows

2007-10-02 Thread darteta001
Thank you all for your answers. I have decided to use aggregate(), but I will keep tapply() in mind. I was wondering if it is possible to tell aggregate() to use two functions at the same time, i.e., mean() and sd(), or is it better to call aggregate() twice, once for mean and once for sd and
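
One way this could be done in a single call, sketched under assumptions: FUN may return a named vector, and the resulting matrix column can be flattened afterwards. The data.frame uses the "ID" and "size" columns from the original question with made-up values, and relies on the formula interface of aggregate() in current versions of R.

```r
# Hypothetical example data: IDs repeated a different number of times
data <- data.frame(
  ID   = c("A", "A", "B", "B", "B"),
  size = c(1, 3, 2, 4, 6)
)

# FUN returns both statistics at once as a named vector
stats <- aggregate(size ~ ID, data = data,
                   FUN = function(x) c(mean = mean(x), sd = sd(x)))

# The two statistics come back as one matrix column; flatten it into
# ordinary columns named size.mean and size.sd
stats <- do.call(data.frame, stats)

print(stats)
```

Calling aggregate() twice and merging the results by ID gives the same table; the single call mainly saves the merge step.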

[R] mean of subset of rows

2007-10-01 Thread darteta001
Dear list, this must be an easy one: I have a data.frame of two columns, "ID" with four different levels (A to D) and numerical "size", and each of the 4 different IDs is repeated a different number of times. I would like to get the mean size for each ID as another data.frame. I have tried th

[R] Comparing regression models

2007-09-14 Thread darteta001
Dear list, I am interested in comparing two linear regression models to see if including one extra variable improves the model significantly. I have read that one possibility is doing an F test on the goodness-of-fit values for both models, and another option is comparing the residuals o
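
A minimal sketch of the F-test route for nested models, assuming both models are fitted with lm() on the same rows: anova() applied to the two fits tests whether the extra variable reduces the residual sum of squares significantly. The built-in mtcars data set and the variables wt and hp are stand-ins chosen for illustration only.

```r
# Smaller model, and the same model with one extra predictor
fit_small <- lm(mpg ~ wt, data = mtcars)
fit_large <- lm(mpg ~ wt + hp, data = mtcars)

# F test comparing the nested models; the second row of the table
# refers to the added variable
cmp <- anova(fit_small, fit_large)

print(cmp)

# p-value for the improvement from adding hp
p_value <- cmp[2, "Pr(>F)"]
```

This comparison is only valid when the smaller model is nested in the larger one.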

[R] k-means clustering

2007-09-12 Thread darteta001
Dear list, first, apologies, as this is not strictly an R question but a theoretical one. I have read that use of k-means clustering assumes sphericity of data distribution. Can anyone explain to me what this means? My statistical background is too poor. Is it another kind of distribution, like g