Re: [O] A manuscript on "reproducible research" introducing org-mode
t...@tsdye.com (Thomas S. Dye) writes: > I just ran across this article on reproducible research that some of you > might find interesting. > > http://journal.r-project.org/archive/2011-2/RJournal_2011-2_Lundholm.pdf On reproducible research, are you guys aware of the relatively recent project knitr? Basically, it is a new Sweave that integrates (i) Sweave, (ii) tikzDevice, (iii) cacheSweave, and (iv) code highlighting into one very well-functioning package. It kind of works with Org, but not ideally¹. It might be nice to integrate it more closely with Babel-R, as it just-worksᵀᴹ. –Rasmus Footnotes: ¹ http://yihui.name/knitr/demo/org/ -- Enough with the bla bla!
Re: [O] A manuscript on "reproducible research" introducing org-mode
Samuel Wales writes: > As a followup to my last comment, this explains how Stapel > fooled almost everybody and kept raw data hidden: > > > http://chronicle.com/blogs/percolator/the-fraud-who-fooled-almost-everyone/27917 > > And NYT "Fraud Case Seen as a Red Flag for Psychology > Research" which has a raw data take: > > > http://www.nytimes.com/2011/11/03/health/research/noted-dutch-psychologist-stapel-accused-of-research-fraud.html > > Thanks for the videos, Stephen, I will check them out. > > I have been running across scads of fraud stories and interesting > studies on conflict of interest, reliability of research results, etc. > It's all over the place, just scattered and nobody pays much > attention, perhaps not wanting to believe it. > > Reproducible research aims directly at this stuff. Chapeau! > > Samuel I just ran across this article on reproducible research that some of you might find interesting. http://journal.r-project.org/archive/2011-2/RJournal_2011-2_Lundholm.pdf All the best, Tom -- Thomas S. Dye http://www.tsdye.com
Re: [O] A manuscript on "reproducible research" introducing org-mode
As a followup to my last comment, this explains how Stapel fooled almost everybody and kept raw data hidden: http://chronicle.com/blogs/percolator/the-fraud-who-fooled-almost-everyone/27917 And NYT "Fraud Case Seen as a Red Flag for Psychology Research" which has a raw data take: http://www.nytimes.com/2011/11/03/health/research/noted-dutch-psychologist-stapel-accused-of-research-fraud.html Thanks for the videos, Stephen, I will check them out. I have been running across scads of fraud stories and interesting studies on conflict of interest, reliability of research results, etc. It's all over the place, just scattered and nobody pays much attention, perhaps not wanting to believe it. Reproducible research aims directly at this stuff. Chapeau! Samuel -- The Kafka Pandemic: http://thekafkapandemic.blogspot.com
Re: [O] A manuscript on "reproducible research" introducing org-mode
Samuel Wales writes: > I applaud all of this. Raw data need to be made available by default > (with only a few exceptions). Org can help people reproduce all of > the succeeding steps also. Some people on the list might like to see the short (13 min) segment on Duke University's recent problems with reproducible research http://www.cbsnews.com/video/watch/?id=7398476n&tag=contentMain;contentAux and the heroic efforts to uncover what had been done (37 min): http://videolectures.net/cancerbioinformatics2010_baggerly_irrh/ Stephen
Re: [O] A manuscript on "reproducible research" introducing org-mode
Hello Jambunathan, The ODT version was prepared "by hand" using LibreOffice. This was written (last May) before your org-odt functions became part of org-mode (if I'm right). I would now also do it with org-mode. Christophe Jambunathan K writes: > Christophe > > I see an ODT file in there - LFPdetection_in.odt > http://hal.archives-ouvertes.fr/hal-00591455/ > > May I ask how the document was produced? > > Do you have any insights on how Org's ODT exporter performs wrt your > input Org file? Just curious. > >> @article{Delescluse2011, >> title = "Making neurophysiological data analysis reproducible: Why and how?", >> journal = "Journal of Physiology-Paris", >> volume = "", >> number = "0", >> pages = " - ", >> year = "2011", >> note = "", >> issn = "0928-4257", >> doi = "10.1016/j.jphysparis.2011.09.011", >> url = "http://www.sciencedirect.com/science/article/pii/S0928425711000374", >> author = "Matthieu Delescluse and Romain Franconville and Sébastien Joucla >> and Tiffany Lieury and Christophe Pouzat", >> keywords = "Software", >> keywords = "R", >> keywords = "Emacs", >> keywords = "Matlab", >> keywords = "Octave", >> keywords = "LATEX", >> keywords = "Org-mode", >> keywords = "Python", >> abstract = "Reproducible data analysis is an approach aiming at >> complementing classical printed scientific articles with everything required >> to independently reproduce the results they present. “Everything” covers >> here: the data, the computer codes and a precise description of how the code >> was applied to the data. A brief history of this approach is presented >> first, starting with what economists have been calling replication since the >> early eighties to end with what is now called reproducible research in >> computational data analysis oriented fields like statistics and signal >> processing. Since efficient tools are instrumental for a routine >> implementation of these approaches, a description of some of the available >> ones is presented next.
A toy example demonstrates then the use of two open >> source software programs for reproducible data analysis: the “Sweave family” >> and the org-mode of emacs. The former is bound to R while the latter can be >> used with R, Matlab, Python and many more “generalist” data processing >> software. Both solutions can be used with Unix-like, Windows and Mac >> families of operating systems. It is argued that neuroscientists could >> communicate much more efficiently their results by adopting the reproducible >> research paradigm from their lab books all the way to their articles, thesis >> and books." >> } -- Most people are not natural-born statisticians. Left to our own devices we are not very good at picking out patterns from a sea of noisy data. To put it another way, we are all too good at picking out non-existent patterns that happen to suit our purposes. Bradley Efron & Robert Tibshirani (1993) An Introduction to the Bootstrap -- Christophe Pouzat MAP5 - Mathématiques Appliquées à Paris 5 CNRS UMR 8145 45, rue des Saints-Pères 75006 PARIS France tel: +33142863828 mobile: +33662941034 web: http://www.biomedicale.univ-paris5.fr/physcerv/C_Pouzat.html
Re: [O] A manuscript on "reproducible research" introducing org-mode
Christophe I see an ODT file in there - LFPdetection_in.odt http://hal.archives-ouvertes.fr/hal-00591455/ May I ask how the document was produced? Do you have any insights on how Org's ODT exporter performs wrt your input Org file? Just curious. > @article{Delescluse2011, > title = "Making neurophysiological data analysis reproducible: Why and how?", > journal = "Journal of Physiology-Paris", > volume = "", > number = "0", > pages = " - ", > year = "2011", > note = "", > issn = "0928-4257", > doi = "10.1016/j.jphysparis.2011.09.011", > url = "http://www.sciencedirect.com/science/article/pii/S0928425711000374", > author = "Matthieu Delescluse and Romain Franconville and Sébastien Joucla > and Tiffany Lieury and Christophe Pouzat", > keywords = "Software", > keywords = "R", > keywords = "Emacs", > keywords = "Matlab", > keywords = "Octave", > keywords = "LATEX", > keywords = "Org-mode", > keywords = "Python", > abstract = "Reproducible data analysis is an approach aiming at complementing > classical printed scientific articles with everything required to > independently reproduce the results they present. “Everything” covers here: > the data, the computer codes and a precise description of how the code was > applied to the data. A brief history of this approach is presented first, > starting with what economists have been calling replication since the early > eighties to end with what is now called reproducible research in > computational data analysis oriented fields like statistics and signal > processing. Since efficient tools are instrumental for a routine > implementation of these approaches, a description of some of the available > ones is presented next. A toy example demonstrates then the use of two open > source software programs for reproducible data analysis: the “Sweave family” > and the org-mode of emacs. The former is bound to R while the latter can be > used with R, Matlab, Python and many more “generalist” data processing > software.
Both solutions can be used with Unix-like, Windows and Mac families > of operating systems. It is argued that neuroscientists could communicate > much more efficiently their results by adopting the reproducible research > paradigm from their lab books all the way to their articles, thesis and > books." > } --
Re: [O] A manuscript on "reproducible research" introducing org-mode
Aloha Tom, Not yet in print, still on the accepted papers list (http://www.sciencedirect.com/science/journal/aip/09284257), sorry. It seems that I chose the "slowest" neuroscience journal! Your JSS paper of last month (with Eric, Dan and Carsten) is great by the way. It seems that I missed the announcements on the list when the pre-print was posted, otherwise I would have managed to cite it in mine. The bibtex entry for my paper (just downloaded from Elsevier site) is: @article{Delescluse2011, title = "Making neurophysiological data analysis reproducible: Why and how?", journal = "Journal of Physiology-Paris", volume = "", number = "0", pages = " - ", year = "2011", note = "", issn = "0928-4257", doi = "10.1016/j.jphysparis.2011.09.011", url = "http://www.sciencedirect.com/science/article/pii/S0928425711000374", author = "Matthieu Delescluse and Romain Franconville and Sébastien Joucla and Tiffany Lieury and Christophe Pouzat", keywords = "Software", keywords = "R", keywords = "Emacs", keywords = "Matlab", keywords = "Octave", keywords = "LATEX", keywords = "Org-mode", keywords = "Python", abstract = "Reproducible data analysis is an approach aiming at complementing classical printed scientific articles with everything required to independently reproduce the results they present. “Everything” covers here: the data, the computer codes and a precise description of how the code was applied to the data. A brief history of this approach is presented first, starting with what economists have been calling replication since the early eighties to end with what is now called reproducible research in computational data analysis oriented fields like statistics and signal processing. Since efficient tools are instrumental for a routine implementation of these approaches, a description of some of the available ones is presented next.
A toy example demonstrates then the use of two open source software programs for reproducible data analysis: the “Sweave family” and the org-mode of emacs. The former is bound to R while the latter can be used with R, Matlab, Python and many more “generalist” data processing software. Both solutions can be used with Unix-like, Windows and Mac families of operating systems. It is argued that neuroscientists could communicate much more efficiently their results by adopting the reproducible research paradigm from their lab books all the way to their articles, thesis and books." } I will post on the list the "official" bibliographic reference as soon as the paper is in print. Take care, Christophe t...@tsdye.com (Thomas S. Dye) writes: > Aloha Christophe, > > Has this article appeared in print? If so, can you forward publication > details? > > All the best, > Tom > > Christophe Pouzat writes: > >> "Thomas S. Dye" a écrit : >> >>> Christophe Pouzat writes: >>> Dear all, M. Delescluse, R. Franconville, S. Joucla, T. Lieury and myself (C. Pouzat) have just put a manuscript entitled: "Making neurophysiological data analysis reproducible. Why and how?" on a pre-print server: http://hal.archives-ouvertes.fr/hal-00591455/fr/ Although the paper has been written for a neurobiological journal, the reader does not have to be a neuroscientist to read and understand it. A toy example illustrating the use of org-mode + Babel (with Python and Octave) takes a fair part of the manuscript. Other tools like R + Sweave are presented and many more are mentioned. I thank Eric Schulte for comments on the manuscript and Eric (again) together with the whole org-mode / Babel community for developing such a great tool. Any comment, remark, suggestion on the manuscript is of course welcome. Christophe >> >>> Aloha Christophe, >>> >>> Thank you for an interesting and useful paper. 
I was happy with the >>> distinction you draw between reproducible analysis and reproducible >>> research, which certainly applies to my field of archaeology where >>> unique sites are typically destroyed by the data collection effort. I >>> also think the emphasis you place on data preprocessing is just the >>> right approach; inclusion of the raw data in a reproducible analysis >>> opens up many possibilities, which must be a benefit to a scientific >>> community's pursuit of knowledge. >>> >>> May I offer a suggestion? Carsten Dominik published the Org Mode 7 >>> Manual last year and it would be nice to see it cited in your paper. >>> >>> @book{dominik10:_org_mode_refer_manual, >>> author = {Carsten Dominik}, >>> title ={The Org Mode 7 Reference Manual: Organize Your Life >>> with GNU Emacs}, >>> publisher ={Network Theory Ltd.}, >>> year = 2010 >>> } >>> >>> All the best, >>> Tom >>> -- >>> Thomas S. Dye >>> http://www.tsdye.com >>> >> >> Dear Tom, >> >> Thanks for these interesting and positive comments. I apologize for >> forgetting the obvious reference to Carsten's reference manual.
Re: [O] A manuscript on "reproducible research" introducing org-mode
I applaud all of this. Raw data need to be made available by default (with only a few exceptions). Org can help people reproduce all of the succeeding steps also. Another aspect is fraud, which is rampant. A psychologist in Europe recently accused of fraud was said to have been able to guard his raw data from all colleagues for *ten years*. His method? Get angry at the requester. Samuel -- The Kafka Pandemic: http://thekafkapandemic.blogspot.com
Re: [O] A manuscript on "reproducible research" introducing org-mode
Aloha Christophe, Has this article appeared in print? If so, can you forward publication details? All the best, Tom Christophe Pouzat writes: > "Thomas S. Dye" a écrit : > >> Christophe Pouzat writes: >> >>> Dear all, >>> >>> M. Delescluse, R. Franconville, S. Joucla, T. Lieury and myself (C. >>> Pouzat) have just put a manuscript entitled: "Making >>> neurophysiological data analysis reproducible. Why and how?" on a >>> pre-print server: http://hal.archives-ouvertes.fr/hal-00591455/fr/ >>> Although the paper has been written for a neurobiological journal, the >>> reader does not have to be a neuroscientist to read and understand it. >>> A toy example illustrating the use of org-mode + Babel (with Python >>> and Octave) takes a fair part of the manuscript. Other tools like R + >>> Sweave are presented and many more are mentioned. >>> >>> I thank Eric Schulte for comments on the manuscript and Eric (again) >>> together with the whole org-mode / Babel community for developing such >>> a great tool. >>> >>> Any comment, remark, suggestion on the manuscript is of course welcome. >>> >>> Christophe >>> > >> Aloha Christophe, >> >> Thank you for an interesting and useful paper. I was happy with the >> distinction you draw between reproducible analysis and reproducible >> research, which certainly applies to my field of archaeology where >> unique sites are typically destroyed by the data collection effort. I >> also think the emphasis you place on data preprocessing is just the >> right approach; inclusion of the raw data in a reproducible analysis >> opens up many possibilities, which must be a benefit to a scientific >> community's pursuit of knowledge. >> >> May I offer a suggestion? Carsten Dominik published the Org Mode 7 >> Manual last year and it would be nice to see it cited in your paper. 
>> >> @book{dominik10:_org_mode_refer_manual, >> author = {Carsten Dominik}, >> title ={The Org Mode 7 Reference Manual: Organize Your Life >> with GNU Emacs}, >> publisher ={Network Theory Ltd.}, >> year = 2010 >> } >> >> All the best, >> Tom >> -- >> Thomas S. Dye >> http://www.tsdye.com >> > > Dear Tom, > > Thanks for these interesting and positive comments. I apologize for > forgetting the obvious reference to Carsten's reference manual. I will > definitely include it in the next version. > I hope that people in my field will come to think the way you do about > sharing their raw data. I'm just afraid that the way is still long… > but the goal is reachable. Raw data aside, org-mode is surely a tool > which should help people experimenting with the "reproducible research > paradigm". As I wrote to Eric (Schulte), M. Delescluse and I wrote a > first RR manuscript 6 years ago based on R/Sweave. The manuscript > never got submitted for different reasons, among them, the amount of > work required to learn R and LaTeX. Learning about org-mode convinced > me that it would be worth re-activating the project. > > Christophe > > Most people are not natural-born statisticians. Left to our own > devices we are not very good at picking out patterns from a sea of > noisy data. To put it another way, we are all too good at picking out > non-existent patterns that happen to suit our purposes. > Bradley Efron & Robert Tibshirani (1993) An Introduction to the Bootstrap > > -- > > Christophe Pouzat > Laboratoire de Physiologie Cerebrale > CNRS UMR 8118 > UFR biomedicale de l'Universite Paris-Descartes > 45, rue des Saints Peres > 75006 PARIS > France > > tel: +33 (0)1 42 86 38 28 > fax: +33 (0)1 42 86 38 30 > mobile: +33 (0)6 62 94 10 34 > web: http://www.biomedicale.univ-paris5.fr/physcerv/C_Pouzat.html > -- Thomas S. Dye http://www.tsdye.com
Re: [O] A manuscript on "reproducible research" introducing org-mode
"Thomas S. Dye" a écrit : Christophe Pouzat writes: Dear all, M. Delescluse, R. Franconville, S. Joucla, T. Lieury and myself (C. Pouzat) have just put a manuscript entitled: "Making neurophysiological data analysis reproducible. Why and how?" on a pre-print server: http://hal.archives-ouvertes.fr/hal-00591455/fr/ Although the paper has been written for a neurobiological journal, the reader does not have to be a neuroscientist to read and understand it. A toy example illustrating the use of org-mode + Babel (with Python and Octave) takes a fair part of the manuscript. Other tools like R + Sweave are presented and many more are mentioned. I thank Eric Schulte for comments on the manuscript and Eric (again) together with the whole org-mode / Babel community for developing such a great tool. Any comment, remark, suggestion on the manuscript is of course welcome. Christophe Aloha Christophe, Thank you for an interesting and useful paper. I was happy with the distinction you draw between reproducible analysis and reproducible research, which certainly applies to my field of archaeology where unique sites are typically destroyed by the data collection effort. I also think the emphasis you place on data preprocessing is just the right approach; inclusion of the raw data in a reproducible analysis opens up many possibilities, which must be a benefit to a scientific community's pursuit of knowledge. May I offer a suggestion? Carsten Dominik published the Org Mode 7 Manual last year and it would be nice to see it cited in your paper. @book{dominik10:_org_mode_refer_manual, author = {Carsten Dominik}, title ={The Org Mode 7 Reference Manual: Organize Your Life with GNU Emacs}, publisher ={Network Theory Ltd.}, year = 2010 } All the best, Tom -- Thomas S. Dye http://www.tsdye.com Dear Tom, Thanks for these interesting and positive comments. I apologize for forgetting the obvious reference to Carsten's reference manual. I will definitely include it in the next version. 
I hope that people in my field will come to think the way you do about sharing their raw data. I'm just afraid that the way is still long… but the goal is reachable. Raw data aside, org-mode is surely a tool which should help people experimenting with the "reproducible research paradigm". As I wrote to Eric (Schulte), M. Delescluse and I wrote a first RR manuscript 6 years ago based on R/Sweave. The manuscript never got submitted for different reasons, among them, the amount of work required to learn R and LaTeX. Learning about org-mode convinced me that it would be worth re-activating the project. Christophe Most people are not natural-born statisticians. Left to our own devices we are not very good at picking out patterns from a sea of noisy data. To put it another way, we are all too good at picking out non-existent patterns that happen to suit our purposes. Bradley Efron & Robert Tibshirani (1993) An Introduction to the Bootstrap -- Christophe Pouzat Laboratoire de Physiologie Cerebrale CNRS UMR 8118 UFR biomedicale de l'Universite Paris-Descartes 45, rue des Saints Peres 75006 PARIS France tel: +33 (0)1 42 86 38 28 fax: +33 (0)1 42 86 38 30 mobile: +33 (0)6 62 94 10 34 web: http://www.biomedicale.univ-paris5.fr/physcerv/C_Pouzat.html
Re: [O] A manuscript on "reproducible research" introducing org-mode
Christophe Pouzat writes: > Dear all, > > M. Delescluse, R. Franconville, S. Joucla, T. Lieury and myself (C. > Pouzat) have just put a manuscript entitled: "Making > neurophysiological data analysis reproducible. Why and how?" on a > pre-print server: http://hal.archives-ouvertes.fr/hal-00591455/fr/ > Although the paper has been written for a neurobiological journal, the > reader does not have to be a neuroscientist to read and understand it. > A toy example illustrating the use of org-mode + Babel (with Python > and Octave) takes a fair part of the manuscript. Other tools like R + > Sweave are presented and many more are mentioned. > > I thank Eric Schulte for comments on the manuscript and Eric (again) > together with the whole org-mode / Babel community for developing such > a great tool. > > Any comment, remark, suggestion on the manuscript is of course welcome. > > Christophe > > Most people are not natural-born statisticians. Left to our own > devices we are not very good at picking out patterns from a sea of > noisy data. To put it another way, we are all too good at picking out > non-existent patterns that happen to suit our purposes. > Bradley Efron & Robert Tibshirani (1993) An Introduction to the Bootstrap > > -- > > Christophe Pouzat > Laboratoire de Physiologie Cerebrale > CNRS UMR 8118 > UFR biomedicale de l'Universite Paris-Descartes > 45, rue des Saints Peres > 75006 PARIS > France > > tel: +33 (0)1 42 86 38 28 > fax: +33 (0)1 42 86 38 30 > mobile: +33 (0)6 62 94 10 34 > web: http://www.biomedicale.univ-paris5.fr/physcerv/C_Pouzat.html > > > Aloha Christophe, Thank you for an interesting and useful paper. I was happy with the distinction you draw between reproducible analysis and reproducible research, which certainly applies to my field of archaeology where unique sites are typically destroyed by the data collection effort. 
I also think the emphasis you place on data preprocessing is just the right approach; inclusion of the raw data in a reproducible analysis opens up many possibilities, which must be a benefit to a scientific community's pursuit of knowledge. May I offer a suggestion? Carsten Dominik published the Org Mode 7 Manual last year and it would be nice to see it cited in your paper. @book{dominik10:_org_mode_refer_manual, author = {Carsten Dominik}, title ={The Org Mode 7 Reference Manual: Organize Your Life with GNU Emacs}, publisher ={Network Theory Ltd.}, year = 2010 } All the best, Tom -- Thomas S. Dye http://www.tsdye.com
[O] A manuscript on "reproducible research" introducing org-mode
Dear all, M. Delescluse, R. Franconville, S. Joucla, T. Lieury and myself (C. Pouzat) have just put a manuscript entitled: "Making neurophysiological data analysis reproducible. Why and how?" on a pre-print server: http://hal.archives-ouvertes.fr/hal-00591455/fr/ Although the paper has been written for a neurobiological journal, the reader does not have to be a neuroscientist to read and understand it. A toy example illustrating the use of org-mode + Babel (with Python and Octave) takes a fair part of the manuscript. Other tools like R + Sweave are presented and many more are mentioned. I thank Eric Schulte for comments on the manuscript and Eric (again) together with the whole org-mode / Babel community for developing such a great tool. Any comment, remark, suggestion on the manuscript is of course welcome. Christophe Most people are not natural-born statisticians. Left to our own devices we are not very good at picking out patterns from a sea of noisy data. To put it another way, we are all too good at picking out non-existent patterns that happen to suit our purposes. Bradley Efron & Robert Tibshirani (1993) An Introduction to the Bootstrap -- Christophe Pouzat Laboratoire de Physiologie Cerebrale CNRS UMR 8118 UFR biomedicale de l'Universite Paris-Descartes 45, rue des Saints Peres 75006 PARIS France tel: +33 (0)1 42 86 38 28 fax: +33 (0)1 42 86 38 30 mobile: +33 (0)6 62 94 10 34 web: http://www.biomedicale.univ-paris5.fr/physcerv/C_Pouzat.html