Dear list,
I would like to perform a Cox PH regression analysis on my "time to
reach a severity score of 5". I have 600 observations and 50 predictor
variables. So I first want to perform a log-rank test (using the
survdiff() function) on each individual variable in order to select a
subset o
Dear list
Is there anywhere I could find further information on how to interpret
the output of validate() for a logistic regression from the Design
package? I tried ?validate and Google but I cannot find information
on what the rows and the columns represent.
Thanks
David
Dear Eleni,
the following is from a previous post regarding the maximum number of
variables in a multiple linear regression analysis, posted last Tuesday;
I think it is also relevant to Cox PH models:
"I can think of
no circumstance where multiple regression on "hundreds of thousands of
variables" is anythi
Hi Eleni,
I am not an expert in R or statistics but in my opinion you have too
many regressors compared to the number of observations and that might
be the reason why you get the error. Others may know better, but as
far as I know, having only 80 observations, it is a good idea to first
filter
Dear list
I would like to compare two measurements of disease severity (M1 and
M2), one of which is continuous (M1 ranging from 1 to 10) and the other
is ordinal (M2 takes Low, Medium, High and Very high). Do you think it is
OK to use the cor() function to test whether the two agree, i.e., correlate?
I a
Dear list,
I have a data.frame with nine categorical variables (0, 1, 2 and NAs),
and I would like to get the number of events for each of them. I can
extract this using summary() for one variable at a time by wrapping it in
as.factor() (otherwise it gives me the mean value):
>summary(as.fact
Dear list,
I have a categorical variable in a data.frame that I would like to
plot using a histogram to show number of events. Values are 0, 1 and
some NAs. I can't get the hist() function to
1) include a column with the number of NAs
2) make the x axis categorical; I always get 0, 0.2,
Dear list,
I have performed several tests for the hypergeometric distribution
using phyper() for some gene annotation categories as follows
> phyper(26, 830, 31042, 337, lower.tail = FALSE)
> phyper(16, 387, 31042, 337, lower.tail = FALSE)
.
.
.
I am only running some selected categories but I would like to cor
Dear Marc and R-list,
thanks for your help. I have checked the Bland-Altman help page about
repeatability, and I learnt that instead of reproducibility, I was
talking about repeatability. Although I am not sure whether they only
focus on agreement of two different measurement methods, and not
Dear list,
I have an experiment that I have run 10 times in order to find out its
reproducibility. I wonder if there is any function that I can use for
obtaining a significance value of reproducibility or agreement of
measurements. I thought of coefficient of variation but, as far as I
know, I
Dear list, first, please accept my apologies for asking a non-R question.
Can anyone point me to a good reference on logistic regression? Web or
book references would be great. I am interested in the use and
interpretation of dummy variables and prediction models.
I checked the contributed section in th
Dear list,
I would like to calculate the trimmed mean for some subsets within a
data.frame (data). The list has already been very helpful, and I have
managed to calculate the standard mean for the "Intensity" column based
on the "Name" column using
> avgs <-aggregate(data$Intensity, by = list(
Thank you all for your answers. I have decided to use aggregate(), but I
will keep tapply() in mind. I was wondering if it is possible to tell
aggregate() to use two functions at the same time, i.e., mean() and sd(),
or whether it is better to call aggregate() twice, once for mean and
once for sd and
Dear list,
this must be an easy one:
I have a data.frame of two columns, "ID" with four different levels (A
to D) and numerical "size", and each of the 4 different IDs is
repeated a different number of times. I would like to get the mean size
for each ID as another data.frame. I have tried th
Dear list,
I am interested in comparing two linear regression models to see if
including one extra variable improves the model significantly. I have
read that one possibility is doing an F test on the goodness-of-fit
values for both models, and that another option is comparing the
residuals o
Dear list, first, apologies, as this is not strictly an R question but
a theoretical one.
I have read that the use of k-means clustering assumes sphericity of the
data distribution. Can anyone explain to me what this means? My statistical
background is too poor. Is it another kind of distribution, like
g
16 matches