Re: [julia-users] Re: documentation suggestions

Douglas Bates Fri, 12 Feb 2016 09:50:45 -0800

On Thursday, February 11, 2016 at 3:06:45 PM UTC-6, ivo welch wrote:
>
>
> hi doug---and vice-versa.  it's interesting that a core function (reading 
> a .csv file) would not be in a native julia library.  when are you 
> switching your students to julia?  regards,  /iaw
>


Writing a function to read a .csv file is not trivial - partly because CSV 
is not well-defined.  It is also a case of an itch getting scratched - if 
those working on Julia with the skills to write such a function don't 
need to read .csv files, that particular functionality stagnates.
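
To make the "not trivial" point concrete, here is a minimal sketch of one 
common complication, a comma inside a quoted field.  The sample record and 
the helper name split_quoted are made up for illustration, the code assumes 
Julia 1.0 or later, and it is not how any Julia package actually parses CSV.

```julia
# A made-up record with three fields; the second field contains a comma
# and is therefore wrapped in double quotes.
line = "1,\"Bates, Douglas\",Madison"

# Naive approach: split on every comma, ignoring quotes.
naive = split(line, ',')
println(length(naive))    # 4 pieces, although the record has 3 fields

# Hypothetical helper: split only on commas that are outside double quotes.
# It still ignores escaped quotes ("") and newlines inside quoted fields,
# both of which appear in real files.
function split_quoted(line)
    fields = String[]
    buf = IOBuffer()
    inquote = false
    for c in line
        if c == '"'
            inquote = !inquote              # toggle on every quote character
        elseif c == ',' && !inquote
            push!(fields, String(take!(buf)))   # field ends at an unquoted comma
        else
            print(buf, c)
        end
    end
    push!(fields, String(take!(buf)))       # last field has no trailing comma
    return fields
end

println(length(split_quoted(line)))    # 3 fields
```

Even this version covers only one of the ways in which files called "CSV" 
differ from one another, which is part of why a general reader takes real 
effort to write.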

The definition and functionality of data frames in Julia, which are the 
natural output when reading a CSV file, are still being debated.  In R the 
choices were much easier because R was designed to emulate S version 3, in 
which a data frame was a central construct.  Sacrifices in performance were 
made to allow for checking for NAs during each atomic arithmetic 
operation.  That trade-off wouldn't fly in Julia.  Also, R vector structures 
all allow for element names - again at a cost in performance.
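
As a rough sketch of what checking for NA on each arithmetic operation 
means: R stores an integer NA as the smallest 32-bit integer, so every 
integer addition has to compare its operands against that value first.  The 
names NA_INT, na_add and plain_add below are made up for illustration, and 
the overflow check that R also performs is left out.

```julia
# R represents an integer NA as the smallest 32-bit integer.
const NA_INT = typemin(Int32)

# Hypothetical NA-aware addition: up to two comparisons before every add.
na_add(a::Int32, b::Int32) = (a == NA_INT || b == NA_INT) ? NA_INT : a + b

# Plain machine addition: no check at all.
plain_add(a::Int32, b::Int32) = a + b

x = Int32[1, 2, NA_INT]
y = Int32[10, 20, 30]

checked = map(na_add, x, y)        # third element stays NA_INT
unchecked = map(plain_add, x, y)   # third element is just a large negative number

println(checked[3] == NA_INT)      # true
println(unchecked[3] == NA_INT)    # false
```

The checked version is what makes missing values propagate correctly, and 
the extra branch on every element is the performance cost in question.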

I'm not really in a position to convert my students, as I am now an 
Emeritus Professor.  I do still offer a seminar series on "Statistics with 
Julia" and have convinced some students to use Julia in thesis research.

I would be quite happy with Julia if only git and I got along better.  I 
just lost three days' worth of work this morning because of yet another git 
disaster.

>
> ----
> Ivo Welch (ivo....@gmail.com)
> http://www.ivo-welch.info/
> J. Fred Weston Distinguished Professor of Finance
> Anderson School at UCLA, C519
> Free Finance Textbook, http://book.ivo-welch.info/
> Exec Editor, Critical Finance Review, 
> http://www.critical-finance-review.org/
> Editor and Publisher, FAMe, http://www.fame-jagazine.com/
>
> On Thu, Feb 11, 2016 at 12:37 PM, Douglas Bates <dmb...@gmail.com> wrote:
>
>> Hi Ivo,
>>
>> Good to hear from you.
>>
>> On Wednesday, February 10, 2016 at 9:58:37 AM UTC-6, ivo welch wrote:
>>>
>>>
>>> ladies and gents---I am not (yet) a julia user.
>>>
>>> may I suggest adding more examples into two places where julia users 
>>> will face starting hurdles?
>>>
>>> [1] the I/O docs of julia.  like, reading and writing csv files that are 
>>> compressed and decompressed on-the-fly, even if not in the ultimate 
>>> efficient manner.    a large fraction of the time and frustration of new 
>>> users is consumed by the task of shoehorning data into and out of new 
>>> computer languages.  with all of R's problems, the ' d <- read.csv("f.csv")' 
>>> and 'd<-read.csv(pipe(paste("gzcat ", fname)))' reduced this entry 
>>> frustration greatly.  perhaps xml file reading and writing.  perhaps...
>>>
>>> [2] more 'standard task' programs would be great.  read a csv file, run 
>>> a regression according to variable names on the command line, print output, 
>>> draw a graph.  I know there are fragments throughout the docs, but some 
>>> section with ready to run complete programs would be good, perhaps at the 
>>> end of the manual.
>>>
>>> in a year, I hope to switch my students from R to julia.
>>>
>>
>> My main use of the RCall package is to import datasets from R into 
>> Julia.  If I have a dataset in an R package I use, e.g.
>>
>> julia> using RCall
>>
>> julia> ds = rcopy("lme4::Dyestuff")
>> 30x2 DataFrames.DataFrame
>> | Row | Batch | Yield  |
>> |-----|-------|--------|
>> | 1   | "A"   | 1545.0 |
>> | 2   | "A"   | 1440.0 |
>> | 3   | "A"   | 1440.0 |
>> | 4   | "A"   | 1520.0 |
>> | 5   | "A"   | 1580.0 |
>> | 6   | "B"   | 1540.0 |
>> | 7   | "B"   | 1555.0 |
>> | 8   | "B"   | 1490.0 |
>> | 9   | "B"   | 1560.0 |
>> | 10  | "B"   | 1495.0 |
>> | 11  | "C"   | 1595.0 |
>> | 12  | "C"   | 1550.0 |
>> | 13  | "C"   | 1605.0 |
>> | 14  | "C"   | 1510.0 |
>> | 15  | "C"   | 1560.0 |
>> | 16  | "D"   | 1445.0 |
>> | 17  | "D"   | 1440.0 |
>> | 18  | "D"   | 1595.0 |
>> | 19  | "D"   | 1465.0 |
>> | 20  | "D"   | 1545.0 |
>> | 21  | "E"   | 1595.0 |
>> | 22  | "E"   | 1630.0 |
>> | 23  | "E"   | 1515.0 |
>> | 24  | "E"   | 1635.0 |
>> | 25  | "E"   | 1625.0 |
>> | 26  | "F"   | 1520.0 |
>> | 27  | "F"   | 1455.0 |
>> | 28  | "F"   | 1450.0 |
>> | 29  | "F"   | 1480.0 |
>> | 30  | "F"   | 1445.0 |
>>
>> If I wanted to read a CSV file using the facilities in R I could use
>>
>> julia> rcopy("read.csv('/usr/share/distro-info/debian.csv')")
>> 17x6 DataFrames.DataFrame
>> | Row | version | codename       | series         | created      | release      | eol          |
>> |-----|---------|----------------|----------------|--------------|--------------|--------------|
>> | 1   | 1.1     | "Buzz"         | "buzz"         | "1993-08-16" | "1996-06-17" | "1997-06-05" |
>> | 2   | 1.2     | "Rex"          | "rex"          | "1996-06-17" | "1996-12-12" | "1998-06-05" |
>> | 3   | 1.3     | "Bo"           | "bo"           | "1996-12-12" | "1997-06-05" | "1999-03-09" |
>> | 4   | 2.0     | "Hamm"         | "hamm"         | "1997-06-05" | "1998-07-24" | "2000-03-09" |
>> | 5   | 2.1     | "Slink"        | "slink"        | "1998-07-24" | "1999-03-09" | "2000-10-30" |
>> | 6   | 2.2     | "Potato"       | "potato"       | "1999-03-09" | "2000-08-15" | "2003-07-30" |
>> | 7   | 3.0     | "Woody"        | "woody"        | "2000-08-15" | "2002-07-19" | "2006-06-30" |
>> | 8   | 3.1     | "Sarge"        | "sarge"        | "2002-07-19" | "2005-06-06" | "2008-03-30" |
>> | 9   | 4.0     | "Etch"         | "etch"         | "2005-06-06" | "2007-04-08" | "2010-02-15" |
>> | 10  | 5.0     | "Lenny"        | "lenny"        | "2007-04-08" | "2009-02-14" | "2012-02-06" |
>> | 11  | 6.0     | "Squeeze"      | "squeeze"      | "2009-02-14" | "2011-02-06" | "2014-05-31" |
>> | 12  | 7.0     | "Wheezy"       | "wheezy"       | "2011-02-06" | "2013-05-04" | ""           |
>> | 13  | 8.0     | "Jessie"       | "jessie"       | "2013-05-04" | "2015-04-25" | ""           |
>> | 14  | 9.0     | "Stretch"      | "stretch"      | "2015-04-25" | ""           | ""           |
>> | 15  | 10.0    | "Buster"       | "buster"       | "2018-07-01" | ""           | ""           |
>> | 16  | NA      | "Sid"          | "sid"          | "1993-08-16" | ""           | ""           |
>> | 17  | NA      | "Experimental" | "experimental" | "1993-08-16" | ""           | ""           |
>>
>>
>> (It turns out that R's allowing either ' or " for enclosing strings is an 
>> advantage for quoting strings within strings.)
>>
>
>
