On Thursday, February 11, 2016 at 3:06:45 PM UTC-6, ivo welch wrote:
>
> hi doug---and vice-versa. it's interesting that a core function (reading
> a .csv file) would not be in a native julia library. when are you
> switching your students to julia? regards, /iaw
>
Writing a function to read a .csv file is not trivial, partly because CSV is not well-defined. It is also a case of an itch needing to be scratched: if those working on Julia who have the skills to write such a function don't need to read .csv files, that particular functionality stagnates.

The definition and functionality of data frames in Julia, which are the natural output when reading a CSV file, are still being debated. In R the choices were much easier because R was designed to emulate S version 3, in which a data frame was a central construct. Sacrifices in performance were made to allow for checking for NAs during each atomic arithmetic operation. That trade-off wouldn't fly in Julia. Also, R vector structures all allow for element names, again at an expense in performance.

I'm not really in a position to convert my students, as I am now an Emeritus Professor. I do still offer a seminar series on "Statistics with Julia" and have convinced some students to use Julia in thesis research.

I would be quite happy with Julia if only git and I got along better. I just lost three days' worth of work this morning because of yet another git disaster.

>
> ----
> Ivo Welch (ivo....@gmail.com)
> http://www.ivo-welch.info/
> J. Fred Weston Distinguished Professor of Finance
> Anderson School at UCLA, C519
> Free Finance Textbook, http://book.ivo-welch.info/
> Exec Editor, Critical Finance Review,
> http://www.critical-finance-review.org/
> Editor and Publisher, FAMe, http://www.fame-jagazine.com/
>
> On Thu, Feb 11, 2016 at 12:37 PM, Douglas Bates <dmb...@gmail.com> wrote:
>
>> Hi Ivo,
>>
>> Good to hear from you.
>>
>> On Wednesday, February 10, 2016 at 9:58:37 AM UTC-6, ivo welch wrote:
>>>
>>> ladies and gents---I am not (yet) a julia user.
>>>
>>> may I suggest adding more examples into two places where julia users
>>> will face starting hurdles?
>>>
>>> [1] the I/O docs of julia.
>>> like, reading and writing csv files that are
>>> compressed and decompressed on-the-fly, even if not in the ultimate
>>> efficient manner. a large fraction of the time and frustration of new
>>> users is consumed by the task of shoehorning data into and out of new
>>> computer languages. with all of R's problem, the ' d <- read.csv("f.csv")'
>>> and 'd<-read.csv(pipe(paste("gzcat ", fname)))' reduced this entry
>>> frustration greatly. perhaps xml file reading and writing. perhaps...
>>>
>>> [2] more 'standard task' programs would be great. read a csv file, run
>>> a regression according to variable names on the command line, print output,
>>> draw a graph. I know there are fragments throughout the docs, but some
>>> section with ready to run complete programs would be good, perhaps at the
>>> end of the manual.
>>>
>>> in a year, I hope to switch my students from R to julia.
>>>
>>
>> My main use of the RCall package is to import datasets from R into
>> Julia. If I have a dataset in an R package I use, e.g.
>>
>> julia> using RCall
>>
>> julia> ds = rcopy("lme4::Dyestuff")
>> 30x2 DataFrames.DataFrame
>> | Row | Batch | Yield  |
>> |-----|-------|--------|
>> | 1   | "A"   | 1545.0 |
>> | 2   | "A"   | 1440.0 |
>> | 3   | "A"   | 1440.0 |
>> | 4   | "A"   | 1520.0 |
>> | 5   | "A"   | 1580.0 |
>> | 6   | "B"   | 1540.0 |
>> | 7   | "B"   | 1555.0 |
>> | 8   | "B"   | 1490.0 |
>> | 9   | "B"   | 1560.0 |
>> | 10  | "B"   | 1495.0 |
>> | 11  | "C"   | 1595.0 |
>> | 12  | "C"   | 1550.0 |
>> | 13  | "C"   | 1605.0 |
>> | 14  | "C"   | 1510.0 |
>> | 15  | "C"   | 1560.0 |
>> | 16  | "D"   | 1445.0 |
>> | 17  | "D"   | 1440.0 |
>> | 18  | "D"   | 1595.0 |
>> | 19  | "D"   | 1465.0 |
>> | 20  | "D"   | 1545.0 |
>> | 21  | "E"   | 1595.0 |
>> | 22  | "E"   | 1630.0 |
>> | 23  | "E"   | 1515.0 |
>> | 24  | "E"   | 1635.0 |
>> | 25  | "E"   | 1625.0 |
>> | 26  | "F"   | 1520.0 |
>> | 27  | "F"   | 1455.0 |
>> | 28  | "F"   | 1450.0 |
>> | 29  | "F"   | 1480.0 |
>> | 30  | "F"   | 1445.0 |
>>
>> If I wanted to read a CSV file using the facilities in R I could use
>>
>> julia> rcopy("read.csv('/usr/share/distro-info/debian.csv')")
>> 17x6 DataFrames.DataFrame
>> | Row | version | codename       | series         | created      | release      | eol          |
>> |-----|---------|----------------|----------------|--------------|--------------|--------------|
>> | 1   | 1.1     | "Buzz"         | "buzz"         | "1993-08-16" | "1996-06-17" | "1997-06-05" |
>> | 2   | 1.2     | "Rex"          | "rex"          | "1996-06-17" | "1996-12-12" | "1998-06-05" |
>> | 3   | 1.3     | "Bo"           | "bo"           | "1996-12-12" | "1997-06-05" | "1999-03-09" |
>> | 4   | 2.0     | "Hamm"         | "hamm"         | "1997-06-05" | "1998-07-24" | "2000-03-09" |
>> | 5   | 2.1     | "Slink"        | "slink"        | "1998-07-24" | "1999-03-09" | "2000-10-30" |
>> | 6   | 2.2     | "Potato"       | "potato"       | "1999-03-09" | "2000-08-15" | "2003-07-30" |
>> | 7   | 3.0     | "Woody"        | "woody"        | "2000-08-15" | "2002-07-19" | "2006-06-30" |
>> | 8   | 3.1     | "Sarge"        | "sarge"        | "2002-07-19" | "2005-06-06" | "2008-03-30" |
>> | 9   | 4.0     | "Etch"         | "etch"         | "2005-06-06" | "2007-04-08" | "2010-02-15" |
>> | 10  | 5.0     | "Lenny"        | "lenny"        | "2007-04-08" | "2009-02-14" | "2012-02-06" |
>> | 11  | 6.0     | "Squeeze"      | "squeeze"      | "2009-02-14" | "2011-02-06" | "2014-05-31" |
>> | 12  | 7.0     | "Wheezy"       | "wheezy"       | "2011-02-06" | "2013-05-04" | ""           |
>> | 13  | 8.0     | "Jessie"       | "jessie"       | "2013-05-04" | "2015-04-25" | ""           |
>> | 14  | 9.0     | "Stretch"      | "stretch"      | "2015-04-25" | ""           | ""           |
>> | 15  | 10.0    | "Buster"       | "buster"       | "2018-07-01" | ""           | ""           |
>> | 16  | NA      | "Sid"          | "sid"          | "1993-08-16" | ""           | ""           |
>> | 17  | NA      | "Experimental" | "experimental" | "1993-08-16" | ""           | ""           |
>>
>> (It turns out that R's allowing either ' or " for enclosing strings is an
>> advantage for quoting strings within strings.)
>>
>