[R] Annotate forest plot 'forest.rma()' for meta-analysis with metafor package

2012-07-24 Thread Jokel Meyer
Dear R-experts,

The forest.rma() function from the metafor package creates nice forest
plots for presenting the results of a meta-analysis. These plots can be
annotated for e.g. giving names to the columns. E.g. as in the
documentation of the package:

data(dat.bcg)

### meta-analysis of the log relative risks using a random-effects model
res <- rma(ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg, measure="RR",
   slab=paste(author, year, sep=", "), method="REML")

   forest(res, slab=paste(dat.bcg$author, dat.bcg$year, sep=", "),
   xlim=c(-16, 6), at=log(c(.05, .25, 1, 4)), atransf=exp,
   ilab=cbind(dat.bcg$tpos, dat.bcg$tneg, dat.bcg$cpos, dat.bcg$cneg),
   ilab.xpos=c(-9.5,-8,-6,-4.5), cex=.75)
op <- par(cex=.75, font=2)
text(c(-9.5,-8,-6,-4.5), 15, c("TB+", "TB-", "TB+", "TB-"))
text(c(-8.75,-5.25), 16, c("Vaccinated", "Control"))
text(-16,15, "Author(s) and Year", pos=4)
text(6,  15, "Relative Risk [95% CI]", pos=2)
par(op)


However the column names have to be re-positioned everytime the number of
included studies in the meta-analysis changes.
E.g.:

dat.bcg <- dat.bcg[1:12,]

### meta-analysis of the log relative risks using a random-effects model
res <- rma(ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg, measure="RR",
   slab=paste(author, year, sep=", "), method="REML")

forest(res, slab=paste(dat.bcg$author, dat.bcg$year, sep=", "),
   xlim=c(-16, 6), at=log(c(.05, .25, 1, 4)), atransf=exp,
   ilab=cbind(dat.bcg$tpos, dat.bcg$tneg, dat.bcg$cpos, dat.bcg$cneg),
   ilab.xpos=c(-9.5,-8,-6,-4.5), cex=.75)
op <- par(cex=.75, font=2)
text(c(-9.5,-8,-6,-4.5), 15, c("TB+", "TB-", "TB+", "TB-"))
text(c(-8.75,-5.25), 16, c("Vaccinated", "Control"))
text(-16,15, "Author(s) and Year", pos=4)
text(6,  15, "Relative Risk [95% CI]", pos=2)
par(op)

Is there a way to define the colum names' position relatively to the plot
so they would be adequately positioned no matter what is the number of
studies included in the plot?


Many thanks for your help!,
Jokel

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Power calculation using pwr.t.test()

2012-06-24 Thread Jokel Meyer
Dear R experts,

I have conducted a power calculation in order to estimate the number of
subjects needed to detect an effect size of d=0.28 (cohen's d) for a
difference between two independent groups (alpha level should be 0.05 and
the effect should be detected with 80% probability).
The results from the code below indicates that I would need n=400 subjects
(200 in each group). This is seems so incredibly high that I mistrust my
results & wanted to ask whether I miscalculated n?

library(pwr)
pwr.t.test(d=0.28,sig.level=0.05,power=0.8,type="two.sample",alternative="two.sided")

Many thanks for you help!
Jokel

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] correct implementation of a mixed-model ANOVA in R

2012-04-15 Thread Jokel Meyer
Dear R-experts!

I having trouble with the correct implementation of a mixed-model ANOVA in
R.

I my dataset subjects were tested in a cognitive performance test
(numerical outcome variable 'score'). This cognitive performance test is
devided into five blocks (categorical factor 'block').  All subjects were
tested two times (in random order once following placebo treatment and once
following drug treatment: categorical factor 'treatment'). In order to
account for re-test effects I coded the testing session (categorical factor
'session' with 1 coding for the first and 2 coding for the second session).
Also all subjects once underwent a blood screening before the study in
which a stable blood marker was measured which is hypothesized to mediate
treatment effects (covariate 'blood').

I would like to look for treatment effects on my outcome variable 'score'
and whether the treatment effect is mediated by the covariate 'blood'. Also
I would like to use 'subject' as a random factor, include 'block' as a
within-subjects factor and control for re-test effects by including
'session'.

Could you advice me regarding the proper implementation of my model? Also I
am happy about any pointers to R-related literature.
Please find below some dummy data and initially model formulation.

Many thanks & best wishes,
Jokel

# some dummy data
data <- structure(list(subject = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L,
4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("1", "2", "3", "4",
"5"), class = "factor"), block = structure(c(1L, 2L, 3L, 1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("1", "2",
"3"), class = "factor"), treatment = structure(c(2L, 2L, 2L,
1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("drug",
"placebo"), class = "factor"), session = structure(c(1L, 1L,
1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L,
1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("1st",
"2nd"), class = "factor"), blood = c(3.04, 3.04, 3.04, 3.04,
3.04, 3.04, 4.93, 4.93, 4.93, 4.93, 4.93, 4.93, 5.65, 5.65, 5.65,
5.65, 5.65, 5.65, 5.7, 5.7, 5.7, 5.7, 5.7, 5.7, 5.77, 5.77, 5.77,
5.77, 5.77, 5.77), score = c(10.98, 11.66, 9.99, 9.15, 9.9, 10.3,
10.78, 9.43, 8.6, 9.67, 10.75, 9.09, 11.18, 10.33, 8.61, 10.04,
9.13, 8.71, 10.03, 9.38, 11.17, 9.82, 10.07, 8.34, 9.94, 11.38,
9.19, 8.35, 10.17, 9.04)), .Names = c("subject", "block", "treatment",
"session", "blood", "score"), row.names = c(NA, -30L), class = "data.frame")

# some first try which however throws a warning
aov1 <-
aov(score~blood*treatment*session*block+Error(subject/(treatment*session*block)),
data=data)

summary(aov1)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Using dcast with multiple functions to aggregate

2012-04-12 Thread Jokel Meyer
Dear R communitiy,

I am trying to use multiple functions for aggregation within a function
call for dcast. However this seems to result in an error. Also I have not
managed to make dcast() work with fun.aggregate=sd. Please find attached
some example code using the ChickWeight data.

Many thanks for your help!
Jokel


#Chick weight example

names(ChickWeight) <- tolower(names(ChickWeight))


sd(ChickWeight$weight) # works fine

mean(ChickWeight$weight) # works fine

length(ChickWeight$weight) # works fine


chick_m <- melt(ChickWeight, id=2:4, na.rm=TRUE)

dcast(chick_m, time~variable, mean) # works fine

dcast(chick_m, time~variable, length) # works fine


dcast(chick_m, time~variable, fun.aggregate=sd) # gives an error

dcast(chick_m, time~variable, c(mean, length)) # gives an error

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Embed R code in online database

2012-02-11 Thread Jokel Meyer
Dear R-List!

I would like to embed R code in an online database such as i.e. a google
spreadsheet in way that users can add data to the database and that R's
calculations are updated automatically and i.e. given out in the
spreadsheet. Maybe even graphs could be updated online? Is there a way to
implement this?

Many thanks!
J.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Workflow for binary classification problem using svm via e1071 package

2011-09-27 Thread Jokel Meyer
Dear R-list!
I am using the e1071 package in R to solve a binary classification problem
in a dataset of round 180 predictor variables (blood metabolites) of two
groups of subjects (patients and healthy controls). I am confused regarding
the correct way to assess the classification accuracy of the trained svm.
(A) The svm command allows to specificy via the 'cross=k' parameter to
specify a k-fold crossvalidation. This results in k values for
classification accuracy and their corresponding mean. (B) On the other hand
most textbooks and tutorials I was browsing, recommend separating the data
into a training and a test-set and then to assess prediction accuarcy by
checking the accuracy of the trained svm when applied to the test-set.
I am not sure whether (A) and (B) would be alternative ways to assess
prediction accuracy? Or is option (A) only the accuracy of the svm when
applied to the test set and therefore I should implement option (B) after I
used option (A)?
So would it be the correct way to use first (A) then do grid-search (via the
tune command) to find the best hyperparameters and then test the trained svm
by applying it to the test set? And in case I use a linear kernel instead of
RBF, I guess I do not need to run grid-search as there are no
hyperparameters to estimate?

BEst,
Jokel

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Assessing prediction performance of SVM using e1071 package

2011-09-24 Thread Jokel Meyer
Dear R-Users!

I am using the svm function (e1071 package) to classify two groups using a
set of 180 indicator variables. Now I am confused about the cross-validation
procedure.

(A) On one hand I use the setting cross=10 in the svm function to run 10
cross-validation iterations and to get an estimate of the svm's performance
in prediction.

(B) On the other hand most tutorials I found recommended separating the set
into two sets using one set for training of the svm and the other for
testing the predictive power of the trained svm.

My understanding would be that (A) and (B) are alternative ways to estimate
the performance of the svm. Or would I have to implement both?

Many thanks for your help!
Jokel

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Main-effect of categorical variables in meta-analysis (metafor)

2011-08-05 Thread Jokel Meyer
Dear R-experts!

In a meta-analysis (metafor) I would like to assess the effect of two
categorical covariates (A & B) whereas they both have 4 levels.
Is my understanding correct that this would require to dummy-code (0,1) each
level of each covariate (A & B)?
However I am interested in the main-effects and the interaction of these two
covariates and the dummy-coding would only allow to detect the effect of one
level of one factor. Would there be a way to assess main-effects and
interactions (something like an meta-analysis-ANOVA)?

Many thanks,
Jokel

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Using NCBI E-Utilities in R to extract data from PubMed

2011-08-05 Thread Jokel Meyer
Dear R-experts!

I found a great post on the "getting genetics done" blog which describes how
to make use of NCBI E-utilities in Python to extract data from PubMed:
http://gettinggeneticsdone.blogspot.com/2011/05/using-ncbi-e-utilities.html

I was wondering whether there would be a way to directly make use of
the NBCI E-utilities from R? It would be amazing to e.g. extract the number
of articles by an author? Or to investigate co-authorship networks?

Many thanks for the help!
Jokel

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Converting F-value from ANOVA to cohen's d in meta-analysis (metafor-package)

2011-07-27 Thread Jokel Meyer
Dear R-experts!

Running a meta-analysis (using the magnificent metafor-package), I use
cohen's d as a main outcome measure in a random-effects model.
For most of the samples cohen's d is derived form a comparison of two groups
(A & B). However some studies report results from an ANOVA (one-factor with
three levels: C,D,E) whereas two groups (C,D) correspond to one group in the
other studies (B=C,D). Is there a way?
The handbook of research synthesis and meta-analysis By Harris M. Cooper
says that:
d=sqrt((F*(n1+n2)/n1*n2)) for ANOVA (one-factor with two-levels)
but does this also hold for ANOVA (one-factor with three levels)?

Many thanks for your help!
Jokel

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Extract confidence intervals from rma object (metafor package)

2011-07-18 Thread Jokel Meyer
Dear R-experts!

I am working on some meta-analysis using the metafor package. I would like
to extract values of the confidence intervals of the effect sizes of the
single studies from an rma object. Those values are printed out when
plotting a forest plot using the forest function on the rma object, however
I was not able to locate them.

Many thanks for your help!
Jokel

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Estimate average standard deviation of mean of two dependent groups

2010-08-25 Thread Jokel Meyer
Dear R-experts!

I am currently running a meta-analysis with the help of the great metafor
package. However I have some difficulties setting up my raw data to enter it
to the meta-analysis models.
I have a group of subjects that have been measured in two continuous
variables (A & B). I have the information about the mean of the two
variables, the group size (n) and the standard deviations of the two
variables.
Now I would like to average both variables (A & B) and get the mean and
standard deviation of the merged variable (C).
As for the mean this would be quiet easy: I would just take the mean of mean
A and mean B to get the mean of C.
However for the standard deviation this seems more tricky as it is to assume
that standard deviations in A & B correlate. I assume (based on further
analysis) a correlation of r =0.5.
I found the formula to get the standard deviation of the SUM (not the mean)
of two variables:
SD=SQRT(SD_A^2 + SD_B^2 + 2*r*SD_A*SD_B)

with SD_B and SD_B being the standard deviation of A and B. And r*SD_A*SD_B
being the covariance of A and B.

Would this formula also be valid if I want to average (and not sum) my two
variables?

Many thanks for any help & best wishes,
Jokel

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.