Re: [O] Best practices for literate programming [was: Latex export of tables]

2013-04-21 Thread Rasmus Pank Roulund

Dear Tom,

 I suppose this depends on what is meant by reproducible.

 My goal is to produce a compendium as defined by Gentleman and Lang
 (see Gentleman R, Lang DT (2004). Statistical Analyses and Reproducible
 Research. Technical report, Bioconductor Project. URL
 http://www.bepress.com/bioconductor/paper2).  

 I keep the init.el file as a babel source block with the reproducible
 document, so it can be tangled. I also have an editing setup in a babel
 source block that activates many of the same features handled by the
 init.el file, but also configures the new exporter to look for init.el
 (which might have a different name). The filters are all part of the Org
 document, too, and get pulled into the init.el file with noweb
 references.

My issue here is that this approach might lead to copy-paste
preambles which may or may not be desirable.  I can certainly see
the attraction in being able to just tangle the setup.  In fact for my
thesis I also had a preamble.tex block in my file.  Your proposed setup
here is perhaps better in that it uses emacs-lisp.

Still, say I'm working on two files A and B.  If I fix a bug in
preamble A I would have to manually copy it over to B.  

Thus, the main question is how to distribute updates?  I guess one
could keep a separate file, but then we are back at square one in a
way. . .

One possibility might be a file structure like this

setup.org
A/project-A.org
A/setup-A.org
B/project-B.org
B/setup-B.org

where A and B both have a block like
#+BEGIN_SRC org
* Preamble :noexport:
#+INCLUDE: ../setup.org
#+INCLUDE: setup-A.org
#+END_SRC

To ship it off one would only have to write a command that replaces
each #+INCLUDE with its content.  The exporter could likely be used for
this and one could produce an archive version when signing off a
project.
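
A minimal sketch of such a command, written in Python purely for
illustration (the thread does not say how it would be implemented, and
Org's own exporter may well be the better tool).  It assumes only the
bare form "#+INCLUDE: file" used in the block above; include lines that
carry extra options are left untouched.  The function name
expand_includes is hypothetical.

```python
import re
from pathlib import Path

# Matches a bare include line such as:  #+INCLUDE: ../setup.org
# or the same with a quoted path:       #+INCLUDE: "../setup.org"
# Lines with trailing options (e.g. src, :lines) do not match.
INCLUDE_RE = re.compile(r'^\s*#\+INCLUDE:\s*"?([^"\s]+)"?\s*$', re.IGNORECASE)


def expand_includes(path, _seen=None):
    """Return the text of `path` with every bare #+INCLUDE line inlined."""
    path = Path(path).resolve()
    seen = set() if _seen is None else _seen
    if path in seen:
        raise ValueError(f"circular include: {path}")
    seen = seen | {path}

    out = []
    for line in path.read_text(encoding="utf-8").splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            # Relative paths are resolved against the including file.
            out.append(expand_includes(path.parent / match.group(1), seen))
        else:
            out.append(line)
    return "\n".join(out)
```

With the layout above, expand_includes("A/project-A.org") would return
one self-contained text that could be written out as the archive copy.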

Even more robust would be for #+INCLUDE: to look for files in
org-directory (it might already do so; I didn't check).

Am I missing something obvious (probably?) in the above stream of
random thoughts?  It's kind of a LaTeX-ish way of dealing with it, I
guess.

 I am able to distribute the compendium, typically as a single
 document (sometimes with associated data files produced by an
 on-line service that can't be used programmatically), which I
 believe is a good step toward reproducibility.

Agreed.

–Rasmus

-- 
Send from my Emacs



Re: [O] Best practices for literate programming [was: Latex export of tables]

2013-04-18 Thread Rasmus

Thomas,

 Tom, do tell us more about what these habits are.

 The new exporter is really your friend.  Where before I might choose to
 generate a LaTeX block, now I look to generate Org output and then count
 on the exporter to do the right thing on the way to pdf.  

 The exporter's attribute system is very easy to use.  The attributes you
 need to access are always right there.

 I've also come to rely on filters quite a bit. I use them for
 non-breaking spaces, the plus/minus symbol, and for the multiple
 citation commands used by biblatex (e.g., \parencites). There seems to
 be a move afoot to collect filters so they can be widely distributed.
 I'd like to see the filters go to the Library of Babel, but for
 reproducible research it is probably best to keep them with the source
 document so there is no doubt about the fidelity of filter code.

I too rely heavily on filters and customizations.  I haven't been able
to fully appreciate the asynchronous exporter yet.

For instance I set some defaults for tables, pictures, add lots of
entities etc. in my init file, and I went as far as writing a separate
init file that loads just the org stuff.  Now, this is clearly /not/
a very reproducible way of doing this.

So I'm really interested in hearing about or seeing implementations where the
goal is reproducibility.  On one hand I can appreciate keeping Org
close to a vanilla state.  On the other hand, I'd have to overwrite
defaults every time (e.g. I /always/ want booktab tables).  I guess I
could keep an emacs-lisp block in the top of the file specifying
stuff, but it also seems kind of tedious (copy-pasting every time).
(Perhaps this could be resolved by loading external files hosted
somewhere accessible).

–Rasmus

-- 
. . . The proofs are technical in nature and provides no real understanding.




Re: [O] Best practices for literate programming [was: Latex export of tables]

2013-04-18 Thread Aaron Ecay
Hi Rasmus,

On 18 April 2013, Rasmus wrote:
 I too rely heavily on filters and customizations.  I haven't been able
 to fully appreciate the asynchronous exporter yet.
 
 For instance I set some defaults for tables, pictures, add lots of
 entities etc. in my init file, and I went as far as writing a separate
 init file that loads just the org stuff.  Now, this is clearly /not/
 a very reproducible way of doing this.
 
 So I'm really interested in hearing about or seeing implementations where the
 goal is reproducibility.  On one hand I can appreciate keeping Org
 close to a vanilla state.  On the other hand, I'd have to overwrite
 defaults every time (e.g. I /always/ want booktab tables).  I guess I
 could keep an emacs-lisp block in the top of the file specifying
 stuff, but it also seems kind of tedious (copy-pasting every time).
 (Perhaps this could be resolved by loading external files hosted
 somewhere accessible).

If your external org configuration file were kept under version control
(I’ll discuss git but the principle is general), then reproducibility
would be possible.  There are ways of embedding git hashes in LaTeX
documents (for one example:
http://thorehusfeldt.net/2011/05/13/including-git-revision-identifiers-in-latex/),
and of course org could help automate this.  Including the git hash of
the document itself, the config file, and org-mode’s own code (assuming
these are kept in 3 separate repos) should allow perfect reproducibility
(modulo incompatible changes in emacs, I guess).

It would be interesting for org to have an ability to reference files
not just by name, but by git revision.  So that you could do something
like (where 123456 is some git hash):
#+include: [[gitbare:/path/to/repo::123456:my-org-setup-file.org]]
and have org take care of checking out the proper revision and loading
the file in the usual way.  This syntax is already implemented, for
plain links, in contrib/lisp/org-git-link.el, so it is just a matter
of making #+include and friends understand links in addition to
filenames.
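
To illustrate what "checking out the proper revision" could amount to,
here is a rough Python sketch.  It assumes, from the example link above
only, that the part before "::" is a git directory and the part after it
is an object name of the form revision:path that "git show" accepts.
The helper names are made up for illustration, and this is not a
description of how org-git-link.el itself works.

```python
import subprocess


def parse_gitbare(link):
    """Split 'gitbare:<git-dir>::<object>' into (git_dir, object)."""
    link = link.strip()
    # Accept the link with or without the surrounding [[...]] brackets.
    if link.startswith("[[") and link.endswith("]]"):
        link = link[2:-2]
    prefix = "gitbare:"
    if not link.startswith(prefix):
        raise ValueError(f"not a gitbare link: {link!r}")
    git_dir, sep, obj = link[len(prefix):].partition("::")
    if not (sep and git_dir and obj):
        raise ValueError(f"expected gitbare:<git-dir>::<object>, got {link!r}")
    return git_dir, obj


def git_show_command(link):
    """Build the git command that prints the file at the given revision."""
    git_dir, obj = parse_gitbare(link)
    return ["git", "--git-dir", git_dir, "show", obj]


def read_at_revision(link):
    """Return the file contents at the linked revision (requires git)."""
    result = subprocess.run(
        git_show_command(link), check=True, capture_output=True, text=True
    )
    return result.stdout
```

An #+include that understood such links would then work on the text
returned by read_at_revision instead of on a file in the working tree.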

-- 
Aaron Ecay



Re: [O] Best practices for literate programming [was: Latex export of tables]

2013-04-18 Thread Rasmus
Aaron Ecay aarone...@gmail.com writes:

 If your external org configuration file were kept under version control
 (I’ll discuss git but the principle is general), then reproducibility
 would be possible.  There are ways of embedding git hashes in LaTeX
 documents (for one example:
 http://thorehusfeldt.net/2011/05/13/including-git-revision-identifiers-in-latex/),
 and of course org could help automate this.  Including the git hash of
 the document itself, the config file, and org-mode’s own code (assuming
 these are kept in 3 separate repos) should allow perfect reproducibility
 (modulo incompatible changes in emacs, I guess).

Sounds interesting.  I'll check it out. 


 It would be interesting for org to have an ability to reference files
 not just by name, but by git revision.  So that you could do something
 like (where 123456 is some git hash):
 #+include: [[gitbare:/path/to/repo::123456:my-org-setup-file.org]]
 and have org take care of checking out the proper revision and loading
 the file in the usual way.  This syntax is already implemented, for
 plain links, in contrib/lisp/org-git-link.el, so it is just a matter
 of making #+include and friends understand links in addition to
 filenames.

Now that is a great idea: it allows for incremental improvements
while still retaining compatibility with old files.

–Rasmus

-- 
And let me remind you also that moderation in the pursuit of justice
is no virtue




Re: [O] Best practices for literate programming [was: Latex export of tables]

2013-04-18 Thread Thomas S. Dye
Aloha Rasmus,

Rasmus ras...@gmx.us writes:

 Thomas,

 Tom, do tell us more about what these habits are.

 The new exporter is really your friend.  Where before I might choose to
 generate a LaTeX block, now I look to generate Org output and then count
 on the exporter to do the right thing on the way to pdf.  

 The exporter's attribute system is very easy to use.  The attributes you
 need to access are always right there.

 I've also come to rely on filters quite a bit. I use them for
 non-breaking spaces, the plus/minus symbol, and for the multiple
 citation commands used by biblatex (e.g., \parencites). There seems to
 be a move afoot to collect filters so they can be widely distributed.
 I'd like to see the filters go to the Library of Babel, but for
 reproducible research it is probably best to keep them with the source
 document so there is no doubt about the fidelity of filter code.

 I too rely heavily on filters and customizations.  I haven't been able
 to fully appreciate the asynchronous exporter yet.

 For instance I set some defaults for tables, pictures, add lots of
 entities etc. in my init file, and I went as far as writing a separate
 init file that loads just the org stuff.  Now, this is clearly /not/
 a very reproducible way of doing this.

I suppose this depends on what is meant by reproducible.

My goal is to produce a compendium as defined by Gentleman and Lang
(see Gentleman R, Lang DT (2004). Statistical Analyses and Reproducible
Research. Technical report, Bioconductor Project. URL
http://www.bepress.com/bioconductor/paper2).  

I keep the init.el file as a babel source block with the reproducible
document, so it can be tangled. I also have an editing setup in a babel
source block that activates many of the same features handled by the
init.el file, but also configures the new exporter to look for init.el
(which might have a different name). The filters are all part of the Org
document, too, and get pulled into the init.el file with noweb
references.

A compendium with this structure gets past the problem, often aired on
the ML, that there is something in my setup that causes unexpected
behavior.  The Org setup is completely contained in the compendium.

I am able to distribute the compendium, typically as a single document
(sometimes with associated data files produced by an on-line service
that can't be used programmatically), which I believe is a good step
toward reproducibility.

Of course, it leaves open the question of changes in the underlying
software. This is a real problem. Org 8.0, with its new (and sweet)
exporter has broken my first two compendia. Conceivably, changes in
Emacs might break a compendium, as could changes in all the other
software referenced by babel code blocks.  Aaron Ecay seems to be on to
a possible mechanism to take care of at least some of this.  AFAICT,
however, his solution doesn't change the utility of the compendium,
which seems to me an integral part of the reproducibility equation.

What do you think?  


 So I'm really interested in hearing about or seeing implementations where the
 goal is reproducibility.  On one hand I can appreciate keeping Org
 close to a vanilla state.  On the other hand, I'd have to overwrite
 defaults every time (e.g. I /always/ want booktab tables).  I guess I
 could keep an emacs-lisp block in the top of the file specifying
 stuff, but it also seems kind of tedious (copy-pasting every time).
 (Perhaps this could be resolved by loading external files hosted
 somewhere accessible).

Some journals specify which LaTeX packages can or cannot be used.
PLOS-One, for instance, doesn't use booktab tables, so a reproducible
research document sent to them couldn't include your default setting.
My sense of the publishing world is that it is variable enough to
eventually break almost any default you might hope to establish.

Incidentally, I think this is an area ripe for growth within Org
mode--additions to the Library of Babel that configure a compendium to
produce LaTeX code that meets the requirements of particular publishing
venues. It would be ideal to do something like init-plos-one and
then, when the journal sends back your paper with a digital pink slip,
change that to something like init-nature and send it off again.

All the best,
Tom 

-- 
T.S. Dye  Colleagues, Archaeologists
735 Bishop St, Suite 315, Honolulu, HI 96813
Tel: 808-529-0866, Fax: 808-529-0884
http://www.tsdye.com



Re: [O] Best practices for literate programming [was: Latex export of tables]

2013-04-17 Thread Suvayu Ali
Hi Vikas,

On Wed, Apr 17, 2013 at 03:40:22AM +0530, Vikas Rawal wrote:
 
  At one point I realised the problem and made the decision to
  split things into two kinds of files: static content (document
  structuring, text, plots, etc), and dynamic content (babel, TikZ blocks
  that generate tables, plots, figures, etc used by the static content
  files).  It is still reproducible research, but modular and less hacky
  (hence more stable).
 
 This is indeed a very neat approach. Would you kindly elaborate?
 
 Would it be too much work for you to get some illustrations from your
 work?

Well ... it was a couple of years back, and the Org version was quite
different, e.g. babel was rapidly evolving.  It might be a fair bit of
work to get it working again.  That said, last year I gave a talk at an
internal workshop, and I made the plots with the attached file.  I didn't
spend time to make sure everything is pretty, so the legend and titles
might be a little wonky.  Just evaluating the two main source blocks
should give you two plots in pdf files.

 In your scheme of things, how do you finally combine the static and
 the dynamic content?
 
 Any chance that you could release the source of something like a
 chapter of your thesis for people to see? Or may be create something
 with dummy content?

The idea is to keep the dynamic content in separate Org files, which you
export less frequently during the course of your writing, e.g. any
tables that are inputs for source blocks.  Evaluating these blocks, or
exporting these dynamic files (whichever you prefer), generates
the graphics, which are then used in the static file.  This is not limited
to plots; you could also write Org/LaTeX tables to separate files.  You can
then easily include those in your static files.

My main motivation for this was to make the export process simpler.  And
since the complicated interacting bits are all isolated and modularised,
there are fewer things that go wrong and many files are updated only
when required, hence faster too!
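
The "updated only when required" part can be sketched as a make-style
timestamp check.  This is only an illustration in Python, not a
description of the actual setup: the helper names are hypothetical, and
plots.org in the usage note is a made-up name for a dynamic file
(Bpluslimits.pdf is the output name used in the attached file).

```python
import os


def needs_update(source, output):
    """True if `output` is missing or older than the `source` Org file."""
    if not os.path.exists(output):
        return True
    return os.path.getmtime(output) < os.path.getmtime(source)


def stale(pairs):
    """Filter (source, output) pairs down to those needing regeneration."""
    return [(src, out) for src, out in pairs if needs_update(src, out)]
```

For example, stale([("plots.org", "Bpluslimits.pdf")]) lists the pair
only when the PDF is missing or older than plots.org, so only that
dynamic file has to be re-evaluated or re-exported before the static
document is exported.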

Anyway, this is all probably very vague without working examples.  I'll
try to come up with something, but I have been rather busy for the last
year or so and do not see any sign of respite in the near future :-/.
I'll get this fleshed out at some point, just don't know how soon.

Hope this was helpful in some way,

:)

-- 
Suvayu

Open source is the future. It sets us free.
#+STARTUP: overview
#+PROPERTY: noweb yes
#+PROPERTY: results silent
#+BIND: org-confirm-babel-evaluate nil


* Gnuplot source preamble   :src:
  :PROPERTIES:
  :VISIBILITY: folded
  :END:

#+name: gnuplot-preamble
#+begin_src gnuplot
  reset
  set terminal pdfcairo color size 21cm,14.8cm
  set termoption enhanced
  set encoding utf8
  set termoption font "DejaVuSerif,8"
  # set output '|display png:-'
  set grid back
  set style line 1 linewidth 9 pointtype 1  linecolor rgb 'orange'
  set style line 2 pointsize 1 pointtype 5  linecolor rgb 'forest-green'
  set style line 3 pointsize 1 pointtype 7  linecolor rgb 'red'
  set style line 4 pointsize 1 pointtype 9  linecolor rgb 'blue'
  set style line 5 pointsize 1 pointtype 11 linecolor rgb 'dark-gray'
  set style line 6 pointsize 1 pointtype 13 linecolor rgb 'brown'
  set style line 7 linewidth 7 pointtype 19 linecolor rgb 'black'
  set style line 10 linewidth 2 linecolor rgb 'black'
  set style line 11 linewidth 5 linecolor rgb 'red'
  set key outside
  set key box linestyle 10
#+end_src


* BF Upper Limit summary plots
** Gnuplot source   :src:
#+name: limits-preamble
#+begin_src gnuplot
  set log y
  set format y "10^{%L}"
  set ylabel 'BF Upper Limit'
  set xtics nomirror rotate by 90 offset character 0,-3
#+end_src

*** B⁺ → h⁻l⁺l⁺ / D⁻l⁺l⁺  :Bplus:
#+begin_src gnuplot :noweb yes :var limits=Bpluslimits
  <<gnuplot-preamble>>
  <<limits-preamble>>
  set xrange [0:8]
  set yrange [1E-14:1E-5]
  set label 'BF Upper Limits:' at graph 1.02,0.55 font ',10'
  set label ' B⁺ → h⁻l⁺l⁺' at graph 1.02,0.5
  set label ' B⁺ → D⁽*⁾⁻l⁺l⁺' at graph 1.02,0.45
  set label 'LHCb limits \@ 95% C.L.' at graph 1.02,0.37 font ',7'
  set label 'Other limits \@ 90% C.L.' at graph 1.02,0.33 font ',7'
  set xtics ("K⁻e⁺e⁺" 1, "K⁻μ⁺μ⁺" 2, "π⁻e⁺e⁺" 3, "π⁻μ⁺μ⁺" 4, "D⁻e⁺e⁺" 5, "D⁻μ⁺μ⁺" 6, "D*⁻μ⁺μ⁺" 7)
  set output "Bpluslimits.pdf"
  plot "$limits" using 1:2 title 'Theory' linestyle 1, \
   "$limits" using 1:3 title 'BaBar' linestyle 2, \
   "$limits" using 1:4 title 'Belle' linestyle 3, \
   "$limits" using 1:5 title 'LHCb' linestyle 4, \
   "$limits" using 1:6 title 'LHCb year-end' linestyle 5, \
   "$limits" using 1:7 title 'LHCb upgrade' linestyle 6
   # 1E-10 with lines linestyle 10 title ''
   # 3.1E-9 with lines linestyle 11
  set output
#+end_src

*** D⁺ → h⁻l⁺l⁺ / Dₛ⁺ → h⁻l⁺l⁺  :Dplus:
#+begin_src gnuplot :noweb yes :var limits=Dpluslimits
  

Re: [O] Best practices for literate programming [was: Latex export of tables]

2013-04-17 Thread Rainer M. Krug
Suvayu Ali fatkasuvayu+li...@gmail.com writes:

 Hi Vikas,

 On Wed, Apr 17, 2013 at 03:40:22AM +0530, Vikas Rawal wrote:
 
  At one point I realised the problem and made the decision to
  split things into two kinds of files: static content (document
  structuring, text, plots, etc), and dynamic content (babel, TikZ blocks
  that generate tables, plots, figures, etc used by the static content
  files).  It is still reproducible research, but modular and less hacky
  (hence more stable).
 
 This is indeed a very neat approach. Would you kindly elaborate?
 
 Would it be too much work for you to get some illustrations from your
 work?

 Well ... it was couple of years back, the Org version was quite
 different, e.g. babel was rapidly evolving.  It might be a fair bit of
 work to get it working again.  That said, last year I gave a talk in an
 internal workshop, I made the plots with the attached file.  I didn't
 spend time to make sure everything is pretty, so the legend and titles
 might be a little wonky.  Just evaluating the two main source blocks
 should give you two plots in pdf files.

 In your scheme of things, how do you finally combine the static and
 the dynamic content?
 
 Any chance that you could release the source of something like a
 chapter of your thesis for people to see? Or may be create something
 with dummy content?

 The idea is to keep the dynamic content on separate org files which you
 export less frequently during the course of your writing, e.g. any
 tables that are inputs for source blocks.  Evaluating these blocks, or
 exporting these dynamic files (whichever is your preference) generates
 the graphic which is then used in the static file.  This is not limited
 to plots, you could write org/LaTeX tables to separate files.  You can
 then easily include those in your static files.

 My main motivation for this was to make the export process simpler.  And
 since the complicated interacting bits are all isolated and modularised,
 there are fewer things that go wrong and many files are updated only
 when required, hence faster too!

 Anyway, this is all probably very vague without working examples.  I'll
 try to come up with something, but I have been rather busy for the last
 year or so and do not see any sign of respite in the near future :-/.
 I'll get this fleshed out at some point, just don't know how soon.

 Hope this was helpful in some way,

 :)
I did not follow the initial thread, but the new header caught my
attention, as I am doing something similar with papers. Nothing against
org for writing papers, but I prefer LyX [1]. But for doing the analysis,
nothing beats org. So in my org file I have the
analysis which creates graphs on export (and a basic report of the
analysis, including all the source code necessary, which I can then use
as an appendix for the paper).

These graphs are then inserted in the LyX file. I assume you used
something similar, except that the output can then be used in the org
file (thesis) - correct?

Cheers,

Rainer

Footnotes: 
[1]  http://www.lyx.org - very nice LaTeX frontend.

-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, 
UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug



Re: [O] Best practices for literate programming [was: Latex export of tables]

2013-04-17 Thread Suvayu Ali
Hi Rainer,

On Wed, Apr 17, 2013 at 11:55:50AM +0200, Rainer M. Krug wrote:
 
 I did not follow the initial thread, but the new header caught my
 attention, as I am doing something similar with papers. Nothing against
 org for writing papers, but I prefer LyX [1]. But for doing the analysis,
 nothing beats org. So in my org file I have the
 analysis which creates graphs on export (and a basic report of the
 analysis, including all the source code necessary, which I can then use
 as an appendix for the paper).
 
 These graphs are then inserted in the LyX file. I assume you used
 something similar, except that the output can then be used in the org
 file (thesis) - correct?

Yes, something like that.  For me the analysis code is usually so
complicated that doing it inside Org would be madness :-p, so I have
dedicated software projects for that.  I only use Org for simple
spreadsheet operations in tables and eventually plotting them.  These
then get included in the final thesis file.

Cheers,

-- 
Suvayu

Open source is the future. It sets us free.



Re: [O] Best practices for literate programming [was: Latex export of tables]

2013-04-16 Thread Thomas S. Dye
Aloha Vikas,

Vikas Rawal vikasli...@agrarianresearch.org writes:

 I've been down it too many times myself. The habits I've developed
 over time have helped, but I think they are less systematic than
 what you've devised.

 Tom, do tell us more about what these habits are.

The new exporter is really your friend.  Where before I might choose to
generate a LaTeX block, now I look to generate Org output and then count
on the exporter to do the right thing on the way to pdf.  

The exporter's attribute system is very easy to use.  The attributes you
need to access are always right there.

I've also come to rely on filters quite a bit. I use them for
non-breaking spaces, the plus/minus symbol, and for the multiple
citation commands used by biblatex (e.g., \parencites). There seems to
be a move afoot to collect filters so they can be widely distributed.
I'd like to see the filters go to the Library of Babel, but for
reproducible research it is probably best to keep them with the source
document so there is no doubt about the fidelity of filter code.

All the best,
Tom

-- 
T.S. Dye  Colleagues, Archaeologists
735 Bishop St, Suite 315, Honolulu, HI 96813
Tel: 808-529-0866, Fax: 808-529-0884
http://www.tsdye.com