Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-16 Thread Martin Becker


On 15.02.2011 22:48, David Scott wrote:

On 16/02/2011 7:04 a.m., Paul Johnson wrote:

...

4. We don't want gratuitous use of return at the end of functions.
Why do people still do that?


Well I for one (and Jeff as well it seems) think it is good 
programming practice. It makes explicit what is being returned 
eliminating the possibility of mistakes and provides clarity for 
anyone reading the code.


David Scott




AFAIR (but I am not sure, maybe some expert can comment on this), there 
is a difference between using return and not using return when R code is 
called from C-code via eval(). If my memory is correct, a return() 
statement (in the R code) would abort the C function (which is trying to 
evaluate the R code, e.g., the body of a function) as well, which is 
probably not intended. So, the use of return() in R code may be quite 
disadvantageous in certain situations.


Martin

--
Dr. Martin Becker
Statistics and Econometrics
Saarland University
Campus C3 1, Room 217
66123 Saarbruecken
Germany

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-16 Thread Duncan Murdoch


On 11-02-16 7:31 AM, Martin Becker wrote:

On 15.02.2011 22:48, David Scott wrote:

On 16/02/2011 7:04 a.m., Paul Johnson wrote:

...

4. We don't want gratuitous use of return at the end of functions.
Why do people still do that?


Well I for one (and Jeff as well it seems) think it is good
programming practice. It makes explicit what is being returned
eliminating the possibility of mistakes and provides clarity for
anyone reading the code.

David Scott




AFAIR (but I am not sure, maybe some expert can comment on this), there
is a difference between using return and not using return when R code is
called from C-code via eval(). If my memory is correct, a return()
statement (in the R code) would abort the C function (which is trying to
evaluate the R code, e.g., the body of a function) as well, which is
probably not intended. So, the use of return() in R code may be quite
disadvantageous in certain situations.


As far as I know there is no such effect.  I suspect what you saw just 
triggered a bug in the C code that had stayed hidden before.


Duncan Murdoch


Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-16 Thread luke-tierney


If you evaluate return(x) in an environment env then that will
execute a return from the function call associated with env or signal
an error if there is none.  That is the way return() is intended to
work.
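
A minimal R-level sketch of this behaviour, using lazy argument evaluation rather than a C-level eval(). The function names f and g are hypothetical and not from the thread. The argument return("early") is a promise created in the frame of g, so when f forces it, the return applies to the call of g:

```r
# f forces its argument; the promise is evaluated in the caller's environment.
f <- function(x) {
  x        # forcing the promise runs return("early") in g's frame
  "late"   # not reached when x contains a return()
}

g <- function() {
  f(return("early"))  # this return() belongs to the call of g, not f
  "not reached"
}

g()
```

Evaluating the body of a function from C in an environment that has no associated function call would, by the same rule, signal an error instead.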

Best,

luke

--
Luke Tierney
Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  l...@stat.uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu


Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-16 Thread Martin Becker


Luke,

thanks for your explanation.
I now remember that I was indeed getting an error (rather than a silent
abort), because I did something comparable to the .Call() to lapply in
section 5.11 of WRE (Writing R Extensions), where expr was (literally)
the body of a function f that contained a return() statement. Although
removing the return() statement solved my problem a few years ago, I now
know that I should have followed the next example in WRE (lapply2),
which is specifically designed for evaluating function calls (instead of
expressions).
So, sorry for the noise (and for blaming return()), and thanks again for
the clarification.


Best,

  Martin


[Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread Paul Johnson

Hello,

I am looking for CRAN packages that don't teach bad habits.  Can I
have suggestions?

I don't mean the recommended packages that come with R, I mean the
contributed ones.  I've been sampling a lot of examples and am
surprised that many ignore seemingly agreed-upon principles of R
coding. In r-devel, almost everyone seems to support the functional
programming theme in Chambers's book on Software For Data Analysis,
but when I go look at randomly selected packages, programmers don't
follow that advice.

In particular:

1. Functions must avoid mystery variables from nowhere.

When reading a function's code, it should not be necessary to ask "what's
variable X?" and go hunting in the commands that lead up to the
function call.  If X is used in the function, it should be a named
argument, or extracted from one of the named arguments.  People who
rely on variables floating around in the user's environment are
creating hard-to-find bugs.

2. We don't want functions with indirect effects (no <<- ), almost always.

3. Code should be vectorized where possible; C-style for loops over
vector members should be avoided.

4. We don't want gratuitous use of return at the end of functions.
Why do people still do that?

5. Neatness counts.  Code should look nice!  Check out how beautiful
the functions in MASS look! I want code with spaces and <- rather
than  everything jammed together with =.
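
A small sketch of points 1 to 3 in R. All function names here are hypothetical and chosen only for illustration, and point 2 is read as referring to the `<<-` operator, which is an assumption about the original wording:

```r
# Point 1: a "mystery variable". X is looked up outside the function.
scale_bad <- function(v) v * X

# Preferred: everything the function needs arrives as an argument.
scale_good <- function(v, X) v * X

# Point 2: '<<-' modifies a variable outside the function (an indirect effect).
counter <- 0
bump_bad <- function() counter <<- counter + 1

# Preferred: return the new value and let the caller assign it.
bump_good <- function(counter) counter + 1

# Point 3: a C-style loop over vector members ...
square_loop <- function(v) {
  out <- numeric(length(v))
  for (i in seq_along(v)) out[i] <- v[i]^2
  out
}

# ... versus the vectorized form.
square_vec <- function(v) v^2

v <- c(1, 2, 3)
identical(square_loop(v), square_vec(v))
```

scale_bad only works if some X happens to exist where the function was defined or in an enclosing environment, which is exactly the kind of hidden dependency point 1 warns about.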

I don't mean to criticize any particular person's code in raising this
point.  For teaching examples, where to focus?

Here's one candidate I've found:

MNP.  As far as I can tell, it meets the first 4 requirements.  And it
has some very clear C code with it as well. I'm only hesitant there
because I'm not entirely sure that a package's C code should introduce
its own functions for handling vectors and matrices, when some general
purpose library might be more desirable.  But that's a small point,
and clarity and completeness count a great deal in my opinion.





-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas


Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread Hadley Wickham

I think my recent packages are pretty good. In particular, I'd
recommend stringr, plyr and testthat as being well written, well
documented and (somewhat) well tested.  I've also been trying to write
up the process of writing good packages.  See
https://github.com/hadley/devtools/wiki for my thoughts so far.

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/


Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread Jeffrey Ryan

I think for teaching, you need to use R itself.

Everything else is going to be a derivative from that, and if you are
looking for 'correctness' or 'consistency' with the spirit of R, you
can only be disappointed - as everyone will take liberties or bring
personal style into the equation.

In addition, your points are debatable in terms of priority/value.
e.g. what is wrong with 'return'?  Certainly provides clarity and
consistency if you have if-else constructs.

We've all learned from reading R sources, and it seems to have worked
out well for many of us.

Jeff


-- 
Jeffrey Ryan
jeffrey.r...@lemnica.com

www.lemnica.com


Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread Gabor Grothendieck

There was some discussion of this on stats stackexchange

http://stats.stackexchange.com/questions/5418/first-r-packages-source-code-to-study-in-preparation-for-writing-own-package

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread Sebastian P. Luque

Hi Paul,

You might want to post this to the teaching list (R-sig-teaching).  I'd
look at packages written by old-timers and R Core.  I've also found that
most Bioconductor packages follow the guidelines you mention and many
other excellent habits very well.  I agree with you that these are very
important things to teach.

Seb



-- 
Seb


Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread David Scott


On 16/02/2011 7:04 a.m., Paul Johnson wrote:

4. We don't want gratuitous use of return at the end of functions.
Why do people still do that?


Well I for one (and Jeff as well it seems) think it is good programming 
practice. It makes explicit what is being returned eliminating the 
possibility of mistakes and provides clarity for anyone reading the code.


David Scott



--
David Scott, Department of Statistics
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018

Director of Consulting, Department of Statistics


Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread Geoff Jentry


On Wed, 16 Feb 2011, David Scott wrote:

4. We don't want gratuitous use of return at the end of functions.
Why do people still do that?
Well I for one (and Jeff as well it seems) think it is good programming 
practice. It makes explicit what is being returned eliminating the 
possibility of mistakes and provides clarity for anyone reading the code.


You're unnecessarily adding the overhead of a function call by explicitly 
calling return().


Sure it seems odd for someone coming from the C/C++/Java/etc world, but
anyone familiar with R should find code that doesn't have an explicit
return() call to be fully readable & clear.


-J


Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread Kevin Wright

For those of you familiar with R, here's a little quiz.  What's the
difference between:


f1 <- function(){
  a=5
}
f1()

f2 <- function(){
  return(a=5)
}
f2()
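
One way to inspect the quiz, sketched with withVisible() from base R; the variable names r1 and r2 are hypothetical. The value of an assignment is returned invisibly, whereas inside the call to return() the expression a = 5 is parsed as a named argument rather than an assignment, so, as far as I know, its value comes back visibly:

```r
f1 <- function() {
  a = 5          # the value of an assignment is returned invisibly
}

f2 <- function() {
  return(a = 5)  # 'a = 5' is a named argument to return(), not an assignment
}

# withVisible() reports both the value and whether it would auto-print.
r1 <- withVisible(f1())
r2 <- withVisible(f2())

r1$value; r1$visible
r2$value; r2$visible
```

At the prompt, the practical difference is whether calling the function prints anything.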


Kevin Wright





Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread Geoff Jentry


f1 <- function(){
 a=5
}


The primary difference is that function 1 uses an incorrect assignment 
operator in an attempt to cause confusion ;)



Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread Jeffrey Ryan

f3 <- function() {
  ( a <- 5 )
}

f4 <- function() {
  a <- 5
  a
}

On my machine f1, f2, and f4 all perform approx. the same.  The () in
f3 adds about 20% overhead.
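
A rough way to repeat this kind of comparison, sketched with base R's system.time(). The helper time_calls and the iteration count are assumptions, not what was actually run for the numbers above, and results will vary with machine and R version:

```r
f1 <- function() { a = 5 }
f2 <- function() { return(a = 5) }
f3 <- function() { ( a <- 5 ) }
f4 <- function() { a <- 5; a }

# Time n calls of f and return the elapsed seconds (hypothetical helper).
time_calls <- function(f, n = 1e5) {
  system.time(for (i in seq_len(n)) f())[["elapsed"]]
}

# One elapsed time per function, named after the function.
timings <- vapply(list(f1 = f1, f2 = f2, f3 = f3, f4 = f4),
                  time_calls, numeric(1))
timings
```

For functions this small the differences are a tiny fraction of a microsecond per call, so several repetitions are needed before reading much into any percentage.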

Jeff


Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread Gabor Grothendieck

On Tue, Feb 15, 2011 at 4:48 PM, David Scott d.sc...@auckland.ac.nz wrote:
 On 16/02/2011 7:04 a.m., Paul Johnson wrote:

 4. We don't want gratuitous use of return at the end of functions.
 Why do people still do that?

 Well I for one (and Jeff as well it seems) think it is good programming
 practice. It makes explicit what is being returned eliminating the
 possibility of mistakes and provides clarity for anyone reading the code.


I think the real good programming practice is to have a single point
of exit at the bottom.   If that is how you program all your functions
then you don't need to explicitly put a return in since it always
returns from the bottom anyways and the return would just clutter your
code.

Sometimes the single point of exit at the bottom is treated as a soft
rule: it is encouraged, but it may be broken if following it would
cause significant code expansion.


Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread Ken.Williams


On 2/15/11 4:35 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:

I think the real good programming practice is to have a single point
of exit at the bottom.

I disagree, it can be extremely useful to exit early from a function.  It
can also make the code much more clear by not having 95% of the body in a
huge else{} block.
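
The two styles under discussion, sketched on a hypothetical helper (the names safe_mean_single and safe_mean_early are not from the thread):

```r
# Single exit at the bottom: the main work sits inside the else branch.
safe_mean_single <- function(x) {
  if (length(x) == 0) {
    result <- NA_real_
  } else {
    x <- x[!is.na(x)]
    result <- sum(x) / length(x)
  }
  result
}

# Early exit: the special case is handled up front with an explicit return().
safe_mean_early <- function(x) {
  if (length(x) == 0) return(NA_real_)
  x <- x[!is.na(x)]
  sum(x) / length(x)
}

safe_mean_single(c(1, NA, 3))
safe_mean_early(c(1, NA, 3))
```

With a single guard the difference is small; it grows with the number of special cases that would otherwise each add a level of nesting.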


If that is how you program all your functions
then you don't need to explicitly put a return in since it always
returns from the bottom anyways and the return would just clutter your
code.

For someone else reading your code, they wouldn't know that you always do
this unless they're very familiar with your coding style.  Even then, it
needs to be manually checked by inspection because nobody sticks with the
rule 100% of the time, so it renders the benefit moot.

--
Ken Williams
Senior Research Scientist
Thomson Reuters
Phone: 651-848-7712
ken.willi...@thomsonreuters.com
http://labs.thomsonreuters.com


Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread David Scott


On 16/02/2011 11:43 a.m., ken.willi...@thomsonreuters.com wrote:


On 2/15/11 4:35 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:


I think the real good programming practice is to have a single point
of exit at the bottom.


I disagree, it can be extremely useful to exit early from a function.  It
can also make the code much more clear by not having 95% of the body in a
huge else{} block.



If that is how you program all your functions
then you don't need to explicitly put a return in since it always
returns from the bottom anyways and the return would just clutter your
code.


For someone else reading your code, they wouldn't know that you always do
this unless they're very familiar with your coding style.  Even then, it
needs to be manually checked by inspection because nobody sticks with the
rule 100% of the time, so it renders the benefit moot.




Some interesting discussion on this point. Enlightening for me at least.

A quick test showed me that an explicit return does produce about a 20% 
time hit in a one-line function (obviously a lesser % in a non-trivial 
function) but enough to convince me not to use an explicit return in 
functions where what is being returned is obvious.
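
A rough way to repeat this kind of timing test is sketched below. The names f_implicit, f_explicit and time_calls are made up for illustration and do not come from the thread. The size of any difference depends heavily on the R version and on whether the functions are byte-compiled, so no particular figure should be expected.

```r
# Two one-line functions that differ only in the explicit return().
f_implicit <- function(x) x + 1
f_explicit <- function(x) return(x + 1)

# Hypothetical helper: elapsed seconds for n calls of f.
time_calls <- function(f, n = 1e5) {
    system.time(for (i in seq_len(n)) f(i))[["elapsed"]]
}

t_implicit <- time_calls(f_implicit)
t_explicit <- time_calls(f_explicit)

# Print both timings; repeat the run a few times before drawing conclusions.
cat("implicit:", t_implicit, " explicit:", t_explicit, "\n")
```

A single run is noisy, so it is worth repeating the comparison several times, or raising n, before reading anything into the numbers.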


Gabor's point is a good one: there *should* be a single exit point at 
the bottom, but I have certainly had situations where an early exit 
seems preferable, as Ken suggests. Then an explicit return may make the 
code sufficiently clear for a violation of Gabor's principle to be 
acceptable.
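
The two styles under discussion can be put side by side in a small sketch. Both functions are invented for illustration (they are not from the thread) and are meant to compute the same thing; only the shape of the control flow differs.

```r
# Single exit: the main work ends up inside the else { } branch.
mean_single_exit <- function(x) {
    if (length(x) == 0) {
        result <- NA_real_
    } else {
        x <- x[!is.na(x)]
        result <- sum(x) / length(x)
    }
    result
}

# Early exit: the special case leaves at once via an explicit return(),
# and the main work stays at the top level of the function body.
mean_early_exit <- function(x) {
    if (length(x) == 0) return(NA_real_)
    x <- x[!is.na(x)]
    sum(x) / length(x)
}
```

In the second version the explicit return() marks the one place where the function leaves early, while the ordinary result is still returned from the bottom without it.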


David Scott






--
_
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018

Director of Consulting, Department of Statistics

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread Gabor Grothendieck

On Tue, Feb 15, 2011 at 5:43 PM,  ken.willi...@thomsonreuters.com wrote:

 On 2/15/11 4:35 PM, Gabor Grothendieck <ggrothendi...@gmail.com> wrote:

I think the real good programming practice is to have a single point
of exit at the bottom.

 I disagree, it can be extremely useful to exit early from a function.  It
 can also make the code much more clear by not having 95% of the body in a
 huge else{} block.


If that is the case then the routines may be too large.  One of the
purposes of this widely practiced principle is to encourage
modularity.

Also achieving code coverage can be simplified when using single point
of return rather than multiple points of return.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread Duncan Murdoch


On 15/02/2011 5:22 PM, Kevin Wright wrote:

For those of you familiar with R, here's a little quiz.  What's the
difference between:


f1 <- function(){
   a=5
}


This returns 5, invisibly.  It's also bad style, according to those of 
us who prefer <- to = for assignment.



f2 <- function(){
   return(a=5)
}


This is a mistake:  return() doesn't take named arguments.  It is 
lenient and lets you get away with this error (treating it the same as
return(5)), and returns the 5, visibly.
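
The visible/invisible distinction can be checked with withVisible() from base R. This is a minimal sketch: the second function uses the plain return(5) form rather than the named-argument form from the quiz, so it does not rely on the leniency described above.

```r
f1 <- function() {
    a <- 5          # the value of an assignment is returned invisibly
}
f2 <- function() {
    return(5)       # return() hands back its value visibly
}

v1 <- withVisible(f1())
v2 <- withVisible(f2())

v1$value == v2$value   # both functions return 5
v1$visible             # FALSE: typing f1() at the prompt prints nothing
v2$visible             # TRUE:  typing f2() at the prompt prints [1] 5
```

Wrapping the call in parentheses, as in (f1()), is the usual way to force an invisible result to print.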

Duncan Murdoch


f2()


Kevin Wright




On Tue, Feb 15, 2011 at 3:55 PM, Geoff Jentry <geoffjen...@hexdump.org> wrote:


On Wed, 16 Feb 2011, David Scott wrote:


4. We don't want gratuitous use of return at the end of functions.

Why do people still do that?


Well I for one (and Jeff as well it seems) think it is good programming
practice. It makes explicit what is being returned eliminating the
possibility of mistakes and provides clarity for anyone reading the code.



You're unnecessarily adding the overhead of a function call by explicitly
calling return().

Sure it seems odd for someone coming from the C/C++/Java/etc world, but
anyone familiar with R should find code that doesn't have an explicit
return() call to be fully readable & clear.

-J



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread Steven McKinney

 -Original Message-
 From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On 
 Behalf Of Duncan Murdoch
 Sent: February-15-11 3:10 PM
 To: Kevin Wright
 Cc: R Devel List
 Subject: Re: [Rd] Request: Suggestions for good teaching packages, esp. 
 with C code

 On 15/02/2011 5:22 PM, Kevin Wright wrote:
  For those of you familiar with R, here's a little quiz.  What's the
  difference between:

  f1 <- function(){
     a=5
  }

 This returns 5, invisibly.  It's also bad style, according to those of
 us who prefer <- to = for assignment.

For maximum clarity

f0 <- function() {
    b <- 5
    return( list( a = b ) )
}

> f0()
$a
[1] 5

Steven McKinney

  f2 <- function(){
     return(a=5)
  }

 This is a mistake:  return() doesn't take named arguments.  It is
 lenient and lets you get away with this error (treating it the same as
 return(5)), and returns the 5, visibly.

 Duncan Murdoch

  f2()

  Kevin Wright

  On Tue, Feb 15, 2011 at 3:55 PM, Geoff Jentry <geoffjen...@hexdump.org> wrote:

  On Wed, 16 Feb 2011, David Scott wrote:

  4. We don't want gratuitous use of return at the end of functions.
  Why do people still do that?

  Well I for one (and Jeff as well it seems) think it is good programming
  practice. It makes explicit what is being returned eliminating the
  possibility of mistakes and provides clarity for anyone reading the code.

  You're unnecessarily adding the overhead of a function call by explicitly
  calling return().

  Sure it seems odd for someone coming from the C/C++/Java/etc world, but
  anyone familiar with R should find code that doesn't have an explicit
  return() call to be fully readable & clear.

  -J


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Request: Suggestions for good teaching packages, esp. with C code

2011-02-15 Thread Ted Byers

From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org]
On Behalf Of Gabor Grothendieck
Sent: February-15-11 6:10 PM
On Tue, Feb 15, 2011 at 5:43 PM,  ken.willi...@thomsonreuters.com wrote:

 On 2/15/11 4:35 PM, Gabor Grothendieck <ggrothendi...@gmail.com> wrote:

I think the real good programming practice is to have a single point 
of exit at the bottom.

(NB: I am drawing on my experience with C++ and Java, as I have 10x as much
experience with them as I do with R.)

It is often not practicable to use a single point of exit.  I routinely
check all the requirements/assumptions of my code at the beginning,
to ensure error conditions do not arise once the code that does the real
work gets started.  That means that there is at least one exit point between
the beginning of my checks and the beginning of my code that is doing the
real work; often more.  These exits generally include construction of an
error condition object with the details of what the error is and why it
happened (but that is my high-performance C++ code, and GUI code written in
Java).
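
An R analogue of this pattern might look like the sketch below. It is only an illustration: input_error and scaled_log are hypothetical names, and the C++ and Java code described above is not shown in the thread. Each failed check builds a condition object carrying the details and exits through stop() before the real work begins.

```r
# Hypothetical helper: an error condition object that carries extra detail.
input_error <- function(msg, detail) {
    structure(
        list(message = msg, call = NULL, detail = detail),
        class = c("input_error", "error", "condition")
    )
}

scaled_log <- function(x, scale) {
    # Checks first; each failed check is an exit point.
    if (!is.numeric(x)) stop(input_error("x must be numeric", class(x)))
    if (any(x <= 0))    stop(input_error("x must be positive", x[x <= 0]))
    if (scale == 0)     stop(input_error("scale must be non-zero", scale))

    # The real work starts only once every assumption holds.
    log(x) / scale
}

# A caller can catch the condition by class and inspect the detail.
res <- tryCatch(scaled_log(c(1, -2), 2), input_error = function(e) e$detail)
```

Here res holds the offending value, because the second check fails and its handler returns the detail field of the condition.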

 I disagree, it can be extremely useful to exit early from a function.  
 It can also make the code much more clear by not having 95% of the 
 body in a huge else{} block.


If that is the case then the routines may be too large.  One of the
purposes of this widely practiced principle is to encourage modularity.

This I'd agree with, to an extent.  I routinely try to keep my functions
short enough to be able to see the whole thing without scrolling.  This
means I break large tasks into a number of small ones, implemented in
functions that can be inlined.  And of course, such small functions make
writing complex conditional blocks much easier and it makes them much more
readable.  Thus, if you looked at my C++ code, you'd find a large number of
smaller functions with a single exit, and a small, but significant, portion
of my functions are a bit longer with multiple exits.

Also achieving code coverage can be simplified when using single point of
return rather than multiple points of return.

This is an issue only if your code is badly designed spaghetti code.  If
your function is that long, it will be a nightmare to write decent unit tests
that test all possible paths through the code, let alone those tests
required to verify that the result it produces is correct.  But if you have
ensured that all your functions can be viewed on your screen without
scrolling, it is easy to see all exit points, and write unit tests for each.
The functions that test for conditions that can produce errors often form
the basis of the unit tests needed for testing every possible exit point
(basically killing two birds with one stone).  This is relatively simple if
handled right, with a good eye for detail.
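
For a function short enough to read at a glance, one test per exit point can be as simple as the following sketch. The function classify is hypothetical and uses only base R; a testing package would do the same job with nicer reporting.

```r
# Hypothetical function with three exit points.
classify <- function(x) {
    if (!is.numeric(x) || length(x) != 1) stop("x must be a single number")
    if (x < 0) return("negative")
    "non-negative"
}

# One check per exit point.
stopifnot(inherits(try(classify("a"), silent = TRUE), "try-error"))  # error exit
stopifnot(identical(classify(-1), "negative"))                       # early return
stopifnot(identical(classify(3), "non-negative"))                    # exit at the bottom
```

The input check that guards the function also supplies the first test case, which is the two-birds-with-one-stone effect mentioned above.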

One of the things I would point out is that such generalities can be useful
in introducing young people to programming, but it is wise not to be too
dogmatic or generalize too widely.

Cheers

Ted

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
