Hi Eric,

Eric Schulte <eric.schu...@gmx.com> writes:

> t...@tsdye.com (Thomas S. Dye) writes:
>
>> Aloha Michael,
>>
>> Michael Hannon <jm_han...@yahoo.com> writes:
>>
>>> Greetings. I'm sitting in on a weekly, informal, "brown-bag" seminar on
>>> data technologies in statistics. There are more people attending the
>>> seminar than there are weeks in which to give talks, so I may get by with
>>> being my usual, passive-slug self.
>>>
>>> But I thought it might be useful to have a contingency plan and decided that
>>> giving a brief talk about Babel might be useful/instructive. I thought (and
>>> think) that mushing together (with attribution) some of the content of the
>>> paper [1] by The Gang of Four and the content of Eric's talk [2] might be a
>>> good approach. (BTW, if this isn't legal, desirable, permissible, etc.,
>>> this would be a good time to tell me.)
>>>
>
> I would be happy for you to re-use these materials.
>
>>>
>>> I liked the Pascal's Triangle example (which morphed from elisp to Python,
>>> or vice versa, in the two references), but I was afraid that the elisp
>>> routine "pst-check", used as a check on the correctness of the
>>> previously-generated Pascal's triangle, might be too esoteric for this
>>> audience, not to mention me. (The recursive Fibonacci function is virtually
>>> identical in all languages, but the second part is more obscure.)
>>>
>
> I was giving a presentation to a local lisp/scheme user group, so I
> figured I'd spare them the pain of trying to read python code :).
>
>>>
>>> I thought it should be possible to use R to do the same sanity check, as R
>>> would be much more-familiar to this audience (and its use would still
>>> demonstrate the meta-language feature of Babel).
>>>
>>> Unfortunately, I haven't been able to find a way to communicate the output
>>> of the Pascal's Triangle example to an R source-code block. The gist of the
>>> problem seems to be that regardless of how I try to grab the data (scan,
>>> readLines, etc.)
>>> Babel always ends up trying to read a data frame (table) and
>>> I get an error similar to:
>>>
>
> I present some options below specific to Tom's discussion, but another
> option may be to use the ":results output" option on a python code block
> which prints the table to STDOUT, and then use something like readLines
> to read from the resulting string into R.
>

I didn't have any luck with :results output, but didn't spend much time
trying to figure it out.

>>>
>>> <<<<<<
>>>> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
>>>> : line 1 did not have 5 elements
>>>
>>> Enter a frame number, or 0 to exit
>>>
>>> 1: read.table("/tmp/babel-3780tje/R-import-3780Akj", header = FALSE,
>>> row.names = NULL, sep = "
>>>>>>>>>
>>>
>>> If I construct a table "by hand" with all of the cells occupied, everything
>>> goes OK. For instance:
>>>
>>> <<<<<<
>>> #+TBLNAME: some-junk
>>> | 1 | 0 | 0 | 0 |
>>> | 1 | 1 | 0 | 0 |
>>> | 1 | 2 | 1 | 0 |
>>> | 1 | 3 | 3 | 1 |
>>>
>>> #+NAME: read-some-junk(sj_input=some-junk)
>>> #+BEGIN_SRC R
>>>
>>> rowSums(sj_input)
>>>
>>> #+END_SRC
>>>
>>> #+RESULTS: read-some-junk
>>> | 1 |
>>> | 2 |
>>> | 4 |
>>> | 8 |
>>>>>>>>>
>>>
>>> But the following gives the kind of error I described above:
>>>
>>> <<<<<<
>>> #+name: pascals_triangle
>>> #+begin_src python :var n=5 :exports none :return pascals_triangle(5)
>>> def pascals_triangle(n):
>>>     if n == 0:
>>>         return [[1]]
>>>     prev_triangle = pascals_triangle(n-1)
>>>     prev_row = prev_triangle[n-1]
>>>     this_row = map(sum, zip([0] + prev_row, prev_row + [0]))
>>>     return prev_triangle + [this_row]
>>>
>>> pascals_triangle(n)
>>> #+end_src
>>
>> A few things are wrong at this point. It seems the JSS article has
>> an error in the header of the pascals_triangle source block. AFAIK
>> there is no header argument :return. I don't know how :return
>> pascals_triangle(5) got there, but am fairly certain it shouldn't be.
>>
>
> The :return header argument *is* a supported header argument of python
> code blocks and is not an error. The python code block should run w/o
> error and without the extra "return pascals_triangle(n)" at the bottom.
> The following works for me.
>
> #+name: pascals_triangle
> #+begin_src python :var n=5 :exports none :return pascals_triangle(5)
> def pascals_triangle(n):
>     if n == 0:
>         return [[1]]
>     prev_triangle = pascals_triangle(n-1)
>     prev_row = prev_triangle[n-1]
>     this_row = map(sum, zip([0] + prev_row, prev_row + [0]))
>     return prev_triangle + [this_row]
>
> #+end_src
>
> #+RESULTS: pascals_triangle
> | 1 |   |    |    |   |   |
> | 1 | 1 |    |    |   |   |
> | 1 | 2 |  1 |    |   |   |
> | 1 | 3 |  3 |  1 |   |   |
> | 1 | 4 |  6 |  4 | 1 |   |
> | 1 | 5 | 10 | 10 | 5 | 1 |
>
> [...]

I'm beginning to see why you have strong feelings about python.

In the code above, the blank line before #+end_src is necessary and must
not contain any spaces, and :var n can be set to anything, since it is
declared for initialization only.

The code in the JSS article doesn't run for me with a recent Org-mode
unless I add a blank line before #+end_src, or remove the :return header
argument. If I remove the :return header argument, then the need for
the blank line goes away.

The following code block seems to work:

#+name: pascals-triangle
#+begin_src python :var n=2 :exports none
def pascals_triangle(n):
    if n == 0:
        return [[1]]
    prev_triangle = pascals_triangle(n-1)
    prev_row = prev_triangle[n-1]
    this_row = map(sum, zip([0] + prev_row, prev_row + [0]))
    return prev_triangle + [this_row]

return pascals_triangle(n)
#+end_src

#+RESULTS: pascals-triangle
| 1 |   |   |
| 1 | 1 |   |
| 1 | 2 | 1 |

I'm guessing that the need for a blank line when using :return has
arisen since the JSS article was published, because the article was
generated from source code and didn't show any errors. If I have this
right (a big if), then might it be possible to re-establish the old
behavior so the JSS code works?
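
A side note for anyone running the function under Python 3 rather than the Python 2 in use here: map returns an iterator there, so a row built with map cannot be concatenated with a list at the next level of recursion. The following is only a sketch of the same function run outside Babel (so it is called directly instead of through :return); wrapping the map in list() is an adjustment for Python 3, not something taken from the JSS article.

```python
def pascals_triangle(n):
    """Return rows 0..n of Pascal's triangle as a list of lists."""
    if n == 0:
        return [[1]]
    prev_triangle = pascals_triangle(n - 1)
    prev_row = prev_triangle[n - 1]
    # list(...) materializes the map iterator so that the next recursion
    # level can evaluate [0] + prev_row without a TypeError.
    this_row = list(map(sum, zip([0] + prev_row, prev_row + [0])))
    return prev_triangle + [this_row]


triangle = pascals_triangle(5)
for row in triangle:
    print(row)
```

Each row is a plain list of a different length, which is the ragged structure that read.table later objects to.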
>>
>> I vaguely remember that it once was possible to pass variables in
>> through the name line, but I couldn't find this syntax in some fairly
>> recent documentation.
>
> This style of passing arguments is still supported, but not necessarily
> encouraged by the documentation.
>
>> It does appear to work still using a recent Org-mode. If I rename the
>> results and then pass that to the source code block, all is well.
>>
>> #+RESULTS: pascals-tri
>> | 1 |   |    |    |   |   |
>> | 1 | 1 |    |    |   |   |
>> | 1 | 2 |  1 |    |   |   |
>> | 1 | 3 |  3 |  1 |   |   |
>> | 1 | 4 |  6 |  4 | 1 |   |
>> | 1 | 5 | 10 | 10 | 5 | 1 |
>>
>>
>> #+name: pst-checkR(p=pascals-tri)
>> #+BEGIN_SRC R
>> p
>> #+END_SRC
>>
>> #+RESULTS: pst-checkR
>>
>> | 1 | nil | nil | nil | nil | nil |
>> | 1 | 1   | nil | nil | nil | nil |
>> | 1 | 2   | 1   | nil | nil | nil |
>> | 1 | 3   | 3   | 1   | nil | nil |
>> | 1 | 4   | 6   | 4   | 1   | nil |
>> | 1 | 5   | 10  | 10  | 5   | 1   |
>>
>> This looks like a bug to me, but Eric S. will know better what might be
>> going on.
>
> The above is due to the inability of R (or at least of the read.table
> function) to read in tables with different row length. The process of
> writing to an Org-mode table and *then* referencing that table as Tom
> suggests above has the side effect of filling in blank spots in the
> final exported table, turning what would otherwise be something like
>
> 1
> 1 1
> 1 2 1
>
> into something like
>
> 1 "" ""
> 1 1  ""
> 1 2  1
>

Thanks for this explanation. It makes sense that mapping a python data
structure to an R data structure would involve an intermediate
representation.

All the best,
Tom

> You could also use a function like the following to explicitly fill in
> these missing lines.
>
> #+name: padded_pascals_triangle
> #+begin_src emacs-lisp :var data=pascals_triangle
>   (let ((max-length (apply #'max (mapcar #'length data))))
>     (mapcar (lambda (row)
>               (append row (make-vector (- max-length (length row)) "") nil))
>             data))
> #+end_src
>
>> I can't do much more than this, but I'm optimistic things will be
>> sorted out before your turn to speak at the seminar rolls around.
>>
>> Thanks for bringing the error in the JSS article to light.
>>
>> All the best,
>> Tom
>>
>
> I often have to explicitly convert data read into R code blocks as a
> table into some other data structure like a vector or a matrix. I run
> into this myself when trying to use the statistical functions of R. It
> generally takes a while to look up the function to do the conversion,
> but I imagine that there is a reason why people who know more R than I
> do chose to make tables the default data type for data read into R
> blocks.
>
> Best,
>
> Combining the examples above yields the following,
>
>
> #+name: pascals_triangle
> #+begin_src python :var n=5 :exports none :return pascals_triangle(5) :results vector
> def pascals_triangle(n):
>     if n == 0:
>         return [[1]]
>     prev_triangle = pascals_triangle(n-1)
>     prev_row = prev_triangle[n-1]
>     this_row = map(sum, zip([0] + prev_row, prev_row + [0]))
>     return prev_triangle + [this_row]
>
> #+end_src
>
> #+name: padded_pascals_triangle
> #+begin_src emacs-lisp :var data=pascals_triangle
>   (let ((max-length (apply #'max (mapcar #'length data))))
>     (mapcar (lambda (row)
>               (append row (make-vector (- max-length (length row)) "") nil))
>             data))
> #+end_src
>
> #+begin_src R :var data=padded_pascals_triangle
> data
> #+end_src
>
> #+RESULTS:
> | 1 | nil | nil | nil | nil | nil |
> | 1 | 1   | nil | nil | nil | nil |
> | 1 | 2   | 1   | nil | nil | nil |
> | 1 | 3   | 3   | 1   | nil | nil |
> | 1 | 4   | 6   | 4   | 1   | nil |
> | 1 | 5   | 10  | 10  | 5   | 1   |
>
>
>>
>>>>>>>>>
>>>
>>> Note that I don't really want to do rowSums in this case.
>>> I'm just trying to demonstrate the error.
>>>
>>> Of course, it's clear that the first line does NOT contain five elements,
>>> nor does the second, etc., as all of the above-diagonal elements are blanks.
>>>
>>> But I've been unable to find an R input function that doesn't end up
>>> treating the source data as a table, i.e., in the context of Babel source
>>> blocks -- R is "happy" to read a lower-diagonal structure. See the
>>> appendix for an example.
>>>
>>> Any suggestions? Note that I'm happy to acknowledge that my own ignorance
>>> of R and/or Babel might be the source of the problem. If so, please
>>> enlighten me.
>>>
>>> Thanks.
>>>
>>> -- Mike
>>>
>>> [1] http://www.jstatsoft.org/v46/i03
>>> [2] https://github.com/eschulte/babel-presentation
>>>
>>> <<<<<<
>>> Appendix
>>> --------
>>>
>>>
>>> $ cat pascal.dat
>>> 1
>>> 1 1
>>> 1 2 1
>>> 1 3 3 1
>>> 1 4 6 4 1
>>>
>>> $ R --vanilla < pascal.R
>>>
>>> R version 2.15.0 (2012-03-30)
>>> Copyright (C) 2012 The R Foundation for Statistical Computing
>>> ISBN 3-900051-07-0
>>> Platform: x86_64-redhat-linux-gnu (64-bit)
>>> .
>>> .
>>> .
>>>
>>>> x <- readLines("pascal.dat")
>>>> x
>>> [1] "1" "1 1" "1 2 1" "1 3 3 1" "1 4 6 4 1"
>>>> str(x)
>>> chr [1:5] "1" "1 1" "1 2 1" "1 3 3 1" "1 4 6 4 1"
>>>>
>>>> y <- scan("pascal.dat")
>>> Read 15 items
>>>> y
>>> [1] 1 1 1 1 2 1 1 3 3 1 1 4 6 4 1
>>>> str(y)
>>> num [1:15] 1 1 1 1 2 1 1 3 3 1 ...
>>>>
>>>> z <- read.table("pascal.dat", header=FALSE)
>>> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
>>> :
>>> line 1 did not have 5 elements
>>> Calls: read.table -> scan
>>> Execution halted
>>>
>>>

-- 
Thomas S. Dye
http://www.tsdye.com
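
The padding that Eric's emacs-lisp block performs could presumably also be done on the Python side, before the table ever reaches R. What follows is only a sketch under stated assumptions: pad_rows is a hypothetical helper name, and filling with 0 instead of "" is a choice made here so that a numeric check such as rowSums would still work, as it did with the hand-built some-junk table.

```python
def pad_rows(rows, fill=0):
    """Pad every row on the right with `fill` so all rows share one length.

    `pad_rows` and the default fill of 0 are illustrative assumptions,
    not part of Org-mode or of the JSS article.
    """
    width = max(len(row) for row in rows)
    return [list(row) + [fill] * (width - len(row)) for row in rows]


# The lower-triangular data from the thread, one list per row.
ragged = [[1], [1, 1], [1, 2, 1], [1, 3, 3, 1]]
padded = pad_rows(ragged)
for row in padded:
    print(row)
```

With every row the same length, the result has the rectangular shape that read.table expects, and the row sums stay 1, 2, 4, 8.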