On Fri, 28 Aug 2015 11:18:21 -0700, Rick Froman wrote:
It is true that we only got a D+ :( but 68% is quite a bit higher :)
than some of the estimates of reproducibility I have heard
bandied about (which are close to zero).
Rick, I think you are manifesting a positivity bias. ;-) That is,
I think you use the wrong number (68%) to represent the results.
Below I quote the results section from the "structured abstract"
of the Science article linked to by Karl:
|We conducted replications of 100 experimental and correlational
|studies published in three psychology journals using high-powered
|designs and original materials when available. There is no single
|standard for evaluating replication success. Here, we evaluated
|reproducibility using significance and P values, effect sizes,
|subjective assessments of replication teams, and meta-analysis
|of effect sizes. The mean effect size (r) of the replication effects
|(Mr = 0.197, SD = 0.257) was half the magnitude of the mean
|effect size of the original effects (Mr = 0.403, SD = 0.188),
|representing a substantial decline. Ninety-seven percent of original
|studies had significant results (P < .05). Thirty-six percent of
|replications had significant results; 47% of original effect sizes
|were in the 95% confidence interval of the replication effect size;
|39% of effects were subjectively rated to have replicated the
|original result; and if no bias in original results is assumed,
|combining original and replication results left 68% with statistically
|significant effects. Correlational tests suggest that replication success was
|better predicted by the strength of original evidence than by
|characteristics of the original and replication teams.
The key points above, IMHO, are:
(1) 97% of the original studies had statistically significant results
while only 36% of the replications had statistically significant
results.
In other words, only about 37% (36/97 * 100) of the original significant
results were replicated (assuming none of the original nonsignificant
results became significant on replication).
(2) The original studies Effect Size (ES), represented by Pearson r,
had a mean r = 0.403 while the replication ES had a mean r= 0.197.
The replicated ES was less than half of the original ES.
(3) By subjective evaluations, only 39% of the original effects were
replicated.
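The arithmetic behind points (1) and (2) can be checked in a few lines. This is just a sketch using the figures quoted in the abstract above (97% and 36% significance rates, mean r of 0.403 vs. 0.197); the variable names are mine, not the authors':

```python
# Figures taken directly from the quoted results section of the abstract.
orig_significant = 0.97   # fraction of original studies with P < .05
rep_significant = 0.36    # fraction of replications with P < .05
orig_mean_r = 0.403       # mean effect size (r) of original studies
rep_mean_r = 0.197        # mean effect size (r) of replications

# Point (1): share of originally significant results that replicated,
# assuming no originally nonsignificant result became significant.
replication_rate = rep_significant / orig_significant
print(f"Replication rate: {replication_rate:.0%}")

# Point (2): replication effect sizes relative to the originals.
es_ratio = rep_mean_r / orig_mean_r
print(f"Effect-size ratio: {es_ratio:.2f}")
```

Running it gives a replication rate of about 37% and an effect-size ratio of about 0.49, i.e., the replication effects were indeed less than half the size of the originals.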
The full article is available at the following URL:
http://www.sciencemag.org/content/349/6251/aac4716.full
The suggested citation for the article is:
Open Science Collaboration (28 August 2015). Estimating the
reproducibility of psychological science. Science, 349(6251),
aac4716. [DOI:10.1126/science.aac4716]
Brian Nosek is identified as the corresponding author and there
appears to be about 70+/-20 other authors (I didn't count) representing
125 institutions (Science Mag did the counting).
So, 68% is one way to represent the results or one could say that
only about 39% of the original findings were replicated. YMMV.
-Mike Palij
New York University
[email protected]
P.S. I have skimmed the full article and what I refer to above as
Pearson r may be what the authors call "Spearman's rank-order
correlations of reproducibility". Have to read the full article to
determine what the difference is.
------------Original Message----------------
On Friday, August 28, 2015 12:59 PM, Karl Louis Wuensch wrote:
This article is going on the reading list for my grad
students
in stats. Once we have covered power and publication biases this
article should, I hope, lead to some lively discussion.
http://www.sciencemag.org/content/349/6251/aac4716