Re: [Rd] :: and ::: as .Primitives?

2015-01-22 Thread William Dunlap
> if I want to use foo::bar many times, say
> in a loop, I would do
>
> foo_bar <- foo::bar
>
> and use foo_bar, or something along those lines.

The foreach package does that with a function from the compiler package,
so that foreach can work on old versions of R:

  comp <- if (getRversion() < "2.13.0") {
    function(expr, ...) expr
  } else {
    compiler::compile
  }

This results in foreach having its own copy of compiler::compile, with
namespace "compiler", but copied from the version of package:compiler
existing on the machine that built the binary of foreach.  If you later
install an updated version of the compiler package, then foreach still
uses the old compiler::compile, which may not work with the private
functions in the new version of package:compiler.

Making :: faster would not fix this particular problem (making 'comp' a
function that contained the if(getRversion...) code would), but things
like this could cause problems when more people put
'myFunc <- otherPackage::Func' in their packages.
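
A minimal sketch of the fix mentioned above, assuming the same name 'comp' as in the snippet. It is an illustration, not foreach's actual code. Because the if() sits inside the function body, compiler::compile is looked up each time comp() is called, so the installed version of the compiler package is used rather than a copy frozen when the binary was built.

```r
# Illustrative only: 'comp' wraps the version check in a function, so the
# lookup of compiler::compile happens at call time, not at build time.
comp <- function(expr, ...) {
  if (getRversion() < "2.13.0") {
    expr                          # old R: return the expression unchanged
  } else {
    compiler::compile(expr, ...)  # resolved against the installed package
  }
}

# Usage: compile a small expression and evaluate the result.
compiled <- comp(quote(1 + 2))
result <- eval(compiled)
```

The cost is one extra function call and one :: lookup per use, which is the trade-off the rest of this thread is about.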




Bill Dunlap
TIBCO Software
wdunlap tibco.com


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] :: and ::: as .Primitives?

2015-01-22 Thread Peter Haverty
Hi all,

I use Luke's "::" hoisting trick often. I think it would be fantastic
if the JIT just did that for you.

The main trouble, for me, is in code I don't own.  When common
Bioconductor packages are loaded, many, many base functions are saddled
with this substantial dispatch and "::" overhead.

While we have the hood up, the parser could help out a bit here too.
It already has special cases for "::" and ":::". Currently you get the
symbols "pkg" and "name" and have to go fishing in the calling
environment for the associated values.  It would be nice to have the
parser or JIT rewrite base::match as doubleColon("base","match") or
directly provide the symbols "base" and "match" to the subsequent
code.
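
A small sketch of what the parser hands over for base::match, using only base R. The variable names (e, pkg, name, f) are made up for illustration. The parsed expression is a call whose elements are the symbol `::` and the two symbols base and match, so the package and object names are already available as symbols before anything is evaluated.

```r
# Parse (but do not evaluate) the expression base::match.
e <- quote(base::match)

is.call(e)    # the expression is a call object
parts <- as.list(e)
parts         # the symbol `::`, then the symbols base and match

# Turn the two symbols into strings and resolve the export by hand.
pkg  <- as.character(e[[2]])
name <- as.character(e[[3]])
f <- getExportedValue(pkg, name)
```

A rewrite step of the kind suggested above would start from exactly these two symbols; getExportedValue() is used here only as a stand-in for the lookup itself.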

I think it's also kind of entertaining that the comments in
base/R/namespace.R note that they are using ":::" for speed purposes
only.

Pete


Peter M. Haverty, Ph.D.
Genentech, Inc.
phave...@gene.com




Re: [Rd] Programming Tools CTV

2015-01-22 Thread Achim Zeileis

On Thu, 22 Jan 2015, Max Kuhn wrote:


> Looking at Luca's page, I think he does a great job of clustering
> packages. My suggestions for focused topics are:
>
> - Package Development*
> - Foreign Languages Interfaces
> - Code Analysis and Debugging
> - Profiling and Benchmarking
> - Unit Testing

Yes, good suggestions. Now we only need willing maintainers :-)

> * I would define the first one to be more narrow than the original
> definition.

It's probably still the fuzziest one in the list above.

> I think that most of these would encompass less than 10 packages if we
> don't include all the Rcpp depends =]

:-)

>> And the task views on CRAN can always include  to further
>> documentation on Github and elsewhere. Especially when it comes to package
>> development there are also clearly different preferences about what is good
>> style or the right tools (say Github vs. R-Forge, knitr vs. Sweave, etc.)
>
> Yes. The comments above would not exclude this approach, which
> is/was/might be my intention for RR.

True.

thx,
Z



Re: [Rd] :: and ::: as .Primitives?

2015-01-22 Thread Michael Lawrence
On Thu, Jan 22, 2015 at 11:44 AM,   wrote:
>
> For default methods there ought to be a way to create those so the
> default method is computed at creation or load time and stored in an
> environment.

We had considered that, but we thought the definition of the function
would be easier to interpret if it explicitly specified the namespace,
instead of using tricks with environments. The same applies for
memoizing the lookup in front of a loop.

The implementation of these functions is almost simpler in C than it
is in R, so there is relatively little risk to this change. But I
agree the benefits are also somewhat minor.



Re: [Rd] Programming Tools CTV

2015-01-22 Thread Max Kuhn
On Thu, Jan 22, 2015 at 1:05 PM, Achim Zeileis  wrote:
> On Thu, 22 Jan 2015, Max Kuhn wrote:
>
>> On Thu, Jan 22, 2015 at 12:45 PM, Achim Zeileis
>>  wrote:
>>>
>>> On Thu, 22 Jan 2015, Max Kuhn wrote:
>>>
>>>> I've had a lot of requests for additions to the reproducible research
>>>> task view that fall into a grey area (to me at least).
>>>>
>>>> For example, roxygen2 is a tool that broadly enables reproducibility
>>>> but I see it more as a tool for better programming. I'm about to check
>>>> in a new version of the task view that includes packrat and
>>>> checkpoint, as they seem closer to reproducible research, but also
>>>> feel like coding tools.
>>>>
>>>> There are a few other packages that many would find useful for better
>>>> coding: devtools, testthat, lintr, codetools, svTools, rbenchmark,
>>>> pkgutils, etc.
>>>>
>>>> There might be some overlap with the HPC task view. I would think that
>>>> rJava, Rcpp and the like are better suited there but this is arguable.
>>>>
>>>> The last time I proposed something like this, Martin deftly convinced
>>>> me to be the maintainer. It is probably better for everyone if we
>>>> avoid that on this occasion.
>>>>
>>>> * Does anyone else see the need for this?
>>>>
>>>> * What other packages fit into this bin?
>>>>
>>>> * Would anyone like to volunteer?
>>>
>>>
>>>
>>> Max, thanks for the suggestion. We had a somewhat related proposal on
>>> R-help from Luca Braglia a couple of months ago, suggesting a "Package
>>> Development" task view:
>>> https://mailman.stat.ethz.ch/pipermail/r-devel/2014-July/069454.html
>>>
>>> He put up some ideas on Github:
>>> https://github.com/lbraglia/PackageDevelopmentTaskView
>>>
>>> When Luca asked me (ctv maintainer) and Dirk (HPC task view maintainer)
>>> for feedback off-list, I replied that it is important that task views are
>>> focused in order to be useful and maintainable. My feeling was that
>>> "PackageDevelopment" was too broad and also "ProgrammingTools" is still
>>> too broad, I think. This could mean a lot of things/tools to a lot of
>>> people.
>>>
>>> But maybe it would be possible to factor out some aspect that is sharp
>>> and clear(er)? Or split it up into bits where there are (more or less)
>>> objectively clear criteria for what goes in and what does not?
>>
>>
>> It's funny that you said that. As I was updating the RR CTV, I
>> realized what a beast it is right now. I thought about making a github
>> project earlier today that would have more detailed examples and
>> information.
>>
>> I see two problems with that as the *sole* solution.
>>
>> First, it is divorced from CRAN CTV and that is a place that people
>> know and will look. I had no idea of Luca's work for this exact
>> reason.
>>
>> Secondly, it might be intimidating for new R users who, I think, are the
>> targeted cohort for the CTVs.
>
>
> Yes, I agree. There should be (an) additional task view(s) on CRAN related
> to this.
>
>> How about a relatively broad definition that is succinct in content
>> with a link to a github repos?
>
>
> I think this doesn't fit well with the existing development model and might
> require duplicating changes in the  of the task view. In order
> to be easily installable I need the  in the task view on CRAN
> and not just in the linked list on Github.

Many of the task views are encyclopedic and still focused. Perhaps my
issues with RR are more related to how I currently organize it. I'll
try to solve it that way.

> Therefore, I would suggest splitting up the topic into things that are
> fairly sharp and clear. (Of course, it is impossible to avoid overlap
> completely.) For example, one could add "LanguageInterfaces" or something
> like that.

Looking at Luca's page, I think he does a great job of clustering
packages. My suggestions for focused topics are:

- Package Development*
- Foreign Languages Interfaces
- Code Analysis and Debugging
- Profiling and Benchmarking
- Unit Testing

* I would define the first one to be more narrow than the original definition.

I think that most of these would encompass less than 10 packages if we
don't include all the Rcpp depends =]

> And the task views on CRAN can always include  to further
> documentation on Github and elsewhere. Especially when it comes to package
> development there are also clearly different preferences about what is good
> style or the right tools (say Github vs. R-Forge, knitr vs. Sweave, etc.)

Yes. The comments above would not exclude this approach, which
is/was/might be my intention for RR.

Thanks,

Max



Re: [Rd] :: and ::: as .Primitives?

2015-01-22 Thread Henrik Bengtsson
On Thu, Jan 22, 2015 at 11:44 AM,   wrote:
> I'm not convinced that how to make :: faster is the right question. If
> you are finding foo::bar being called often enough to matter to your
> overall performance then to me the question is: why are you calling
> foo::bar more than once? Making :: a bit faster by making it a
> primitive will remove some overhead, but you are still left with a
> lot of work that shouldn't need to happen more than once.
>
> For default methods there ought to be a way to create those so the
> default method is computed at creation or load time and stored in an
> environment. For other cases if I want to use foo::bar many times, say
> in a loop, I would do
>
> foo_bar <- foo::bar
>
> and use foo_bar, or something along those lines.

While you're on the line: Do you think this is an optimization that
the 'compiler' package and its cmpfun() byte compiler will be able to
do in the future?

/Henrik



Re: [Rd] :: and ::: as .Primitives?

2015-01-22 Thread Tim Keitt
On Thu, Jan 22, 2015 at 1:44 PM,  wrote:

> I'm not convinced that how to make :: faster is the right question. If
> you are finding foo::bar being called often enough to matter to your
> overall performance then to me the question is: why are you calling
> foo::bar more than once? Making :: a bit faster by making it a
> primitive will remove some overhead, but you are still left with a
> lot of work that shouldn't need to happen more than once.
>
> For default methods there ought to be a way to create those so the
> default method is computed at creation or load time and stored in an
> environment. For other cases if I want to use foo::bar many times, say
> in a loop, I would do
>
> foo_bar <- foo::bar
>
> and use foo_bar, or something along those lines.
>
> When :: and ::: were introduced they were intended primarily for
> reflection and debugging, so speed was not an issue. ::: is still
> really only reliably usable that way, and making it faster may just
> encourage bad practice. :: is different and there are good arguments
> for using it in code, but I'm not yet seeing good arguments for use in
> ways that would be performance-critical, but I'm happy to be convinced
> otherwise. If there is a need for a faster :: then going to a
> SPECIALSXP is fine; it would also be good to make the byte code
> compiler aware of it, and possibly to work on ways to improve the
> performance further e.g. through cacheing.
>

I think you will find that no matter how much it does not matter in terms
of performance, folks will avoid :: out of principle if they think it's
slower. We're conditioned to write efficient code even when it does not
really impact real world usage. As using :: is good practice in many
contexts, making it fast will encourage folks to use it.

THK





-- 
http://www.keittlab.org/



Re: [Rd] :: and ::: as .Primitives?

2015-01-22 Thread luke-tierney

I'm not convinced that how to make :: faster is the right question. If
you are finding foo::bar being called often enough to matter to your
overall performance then to me the question is: why are you calling
foo::bar more than once? Making :: a bit faster by making it a
primitive will remove some overhead, but you are still left with a
lot of work that shouldn't need to happen more than once.

For default methods there ought to be a way to create those so the
default method is computed at creation or load time and stored in an
environment. For other cases if I want to use foo::bar many times, say
in a loop, I would do

foo_bar <- foo::bar

and use foo_bar, or something along those lines.
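
A rough sketch of the hoisting pattern, with base::match standing in for foo::bar. The names n, tbl, base_match and the two result vectors are made up for illustration, and the timings depend on the R version and machine. The first loop performs the :: lookup on every iteration; the second performs it once, before the loop.

```r
n <- 10000
tbl <- letters

# Lookup repeated inside the loop: `::` is evaluated n times.
hits_each <- integer(n)
t_each <- system.time(
  for (i in seq_len(n)) hits_each[i] <- base::match("k", tbl)
)

# Lookup hoisted out of the loop: `::` is evaluated once.
base_match <- base::match
hits_hoisted <- integer(n)
t_hoisted <- system.time(
  for (i in seq_len(n)) hits_hoisted[i] <- base_match("k", tbl)
)

# Both loops compute the same values; only the lookup cost differs.
same <- identical(hits_each, hits_hoisted)
```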

When :: and ::: were introduced they were intended primarily for
reflection and debugging, so speed was not an issue. ::: is still
really only reliably usable that way, and making it faster may just
encourage bad practice. :: is different and there are good arguments
for using it in code. I'm not yet seeing good arguments for use in
ways that would be performance-critical, but I'm happy to be convinced
otherwise. If there is a need for a faster :: then going to a
SPECIALSXP is fine; it would also be good to make the byte code
compiler aware of it, and possibly to work on ways to improve the
performance further, e.g. through caching.

Best,

luke




--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:   319-335-3386
Department of Statistics and        Fax:     319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:     http://www.stat.uiowa.edu



[Rd] :: and ::: as .Primitives?

2015-01-22 Thread Peter Haverty
Hi all,

When S4 methods are defined on a base function (say, "match"), the
function becomes a method with the body "base::match(x,y)". A call to
such a function often spends more time doing "::" than in the function
itself.  I always assumed that "::" was a very low-level thing, but it
turns out to be a plain old function defined in base/R/namespace.R.
What would you all think about making "::" and ":::" .Primitives?  I
have submitted some examples, timings, and a patch to the R bug
tracker (https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16134).
I'd be very interested to hear your thoughts on the matter.
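
A quick way to check the point above on a given installation, as a sketch in base R only. The names dc, n, t_lookup and t_plain are made up for illustration. Whether is.primitive() returns TRUE or FALSE, and how large the timing gap is, depends on the R version, so no results are shown here.

```r
# Fetch the `::` operator itself and inspect what kind of object it is.
dc <- get("::", envir = baseenv())
is.function(dc)
is.primitive(dc)   # FALSE would mean an ordinary R-level closure

# Time the lookup alone (no call to match), against a plain symbol lookup.
n <- 1e5
t_lookup <- system.time(for (i in seq_len(n)) base::match)
t_plain  <- system.time(for (i in seq_len(n)) match)
```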

Regards,
Pete


Peter M. Haverty, Ph.D.
Genentech, Inc.
phave...@gene.com



[Rd] speedbump in library

2015-01-22 Thread Peter Haverty
Hi all,

Profiling turned up a bit of a speedbump in the library function. I
submitted a patch to the R bug tracker as bug 16168 and I've also
included it below. The alternate code is simpler and easier to
read/maintain, I believe.  Any thoughts on other ways to write this?

Index: src/library/base/R/library.R
===================================================================
--- src/library/base/R/library.R    (revision 67578)
+++ src/library/base/R/library.R    (working copy)
@@ -688,18 +688,8 @@
     out <- character()

     for(pkg in package) {
-        paths <- character()
-        for(lib in lib.loc) {
-            dirs <- list.files(lib,
-                               pattern = paste0("^", pkg, "$"),
-                               full.names = TRUE)
-            ## Note that we cannot use tools::file_test() here, as
-            ## cyclic namespace dependencies are not supported.  Argh.
-            paths <- c(paths,
-                       dirs[dir.exists(dirs) &
-                            file.exists(file.path(dirs,
-                                                  "DESCRIPTION"))])
-        }
+        paths <- file.path(lib.loc, pkg)
+        paths <- paths[ file.exists(file.path(paths, "DESCRIPTION")) ]
         if(use_loaded && pkg %in% loadedNamespaces()) {
             dir <- if (pkg == "base") system.file()
             else getNamespaceInfo(pkg, "path")
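
The two lookups in the patch can be compared side by side with a standalone sketch. The wrapper names find_old and find_new are made up for illustration, and "stats" is just an arbitrary installed package. The old version lists every library directory and filters by a regular expression; the new version builds the candidate paths directly and keeps those that contain a DESCRIPTION file.

```r
# Old approach: list each library directory and match the package name.
find_old <- function(pkg, lib.loc) {
  paths <- character()
  for (lib in lib.loc) {
    dirs <- list.files(lib, pattern = paste0("^", pkg, "$"),
                       full.names = TRUE)
    paths <- c(paths,
               dirs[dir.exists(dirs) &
                    file.exists(file.path(dirs, "DESCRIPTION"))])
  }
  paths
}

# New approach: build the candidate paths and test them directly.
find_new <- function(pkg, lib.loc) {
  paths <- file.path(lib.loc, pkg)
  paths[file.exists(file.path(paths, "DESCRIPTION"))]
}

old_paths <- find_old("stats", .libPaths())
new_paths <- find_new("stats", .libPaths())
```

One difference to keep in mind: the old code treats the package name as a regular expression, so a name containing "." could match other directories, while the new code compares paths literally.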

Pete


Peter M. Haverty, Ph.D.
Genentech, Inc.
phave...@gene.com



Re: [Rd] Programming Tools CTV

2015-01-22 Thread Achim Zeileis

On Thu, 22 Jan 2015, Max Kuhn wrote:



It's funny that you said that. As I was updating the RR CTV, it
realized what a beast it is right now. I thought about making a github
project earlier today that would have more detailed examples and
information.

I see two problems with that as the *sole* solution.

First, it is divorced from CRAN CTV and that is a place that people
know and will look. I had no idea of Luca's work for this exact
reason.

Secondly, might be intimidating for new R users who, I think, are the
targeted cohort for the CTVs.


Yes, I agree. There should (an) additional task view(s) on CRAN related to 
this.



How about a relatively broad definition that is succinct in content
with a link to a github repos?


I think this doesn't fit well with the existing development model and 
might require duplicating changes in the  of the task view. 
In order to be easily installable I need the  in the task 
view on CRAN and not just in the linked list on Github.


Therefore, I would suggest splitting up the topic into things that are 
fairly sharp and clear. (Of course, it is impossible to avoid overlap 
completely.) For example, one could add "LanguageInterfaces" or something 
like that.


And the task views on CRAN can always include  to further 
documentation on Github and elsewhere. Especially when it comes to package 
development there are also clearly different preferences about what is 
good style or the right tools (say Github vs. R-Forge, knitr vs. Sweave, 
etc.)



Thanks,

Max





Re: [Rd] Programming Tools CTV

2015-01-22 Thread Max Kuhn
On Thu, Jan 22, 2015 at 12:45 PM, Achim Zeileis
 wrote:
> On Thu, 22 Jan 2015, Max Kuhn wrote:
>
>> I've had a lot of requests for additions to the reproducible research
>> task view that fall into a grey area (to me at least).
>>
>> For example, roxygen2 is a tool that broadly enables reproducibility
>> but I see it more as a tool for better programming. I'm about to check
>> in a new version of the task view that includes packrat and
>> checkpoint, as they seem closer to reproducible research, but also
>> feel like coding tools.
>>
>> There are a few other packages that many would find useful for better
>> coding: devtools, testthat, lintr, codetools, svTools, rbenchmark,
>> pkgutils, etc.
>>
>> There might be some overlap with the HPC task view. I would think that
>> rJava, Rcpp and the like are better suited there but this is arguable.
>>
>> The last time I proposed something like this, Martin deftly convinced
>> me to be the maintainer. It is probably better for everyone if we
>> avoid that on this occasion.
>>
>> * Does anyone else see the need for this?
>>
>> * What other packages fit into this bin?
>>
>> * Would anyone like to volunteer?
>
>
> Max, thanks for the suggestion. We had a somewhat related proposal on R-devel
> from Luca Braglia a couple of months ago, suggesting a "Package Development"
> task view:
> https://mailman.stat.ethz.ch/pipermail/r-devel/2014-July/069454.html
>
> He put up some ideas on Github:
> https://github.com/lbraglia/PackageDevelopmentTaskView
>
> When Luca asked me (ctv maintainer) and Dirk (HPC task view maintainer) for
> feedback off-list, I replied that it is important that task views are
> focused in order to be useful and maintainable. My feeling was that
> "PackageDevelopment" was too broad and also "ProgrammingTools" is still too
> broad, I think. This could mean a lot of things/tools to a lot of people.
>
> But maybe it would be possible to factor out some aspect that is sharp and clear(er)?
> Or split it up into bits where there are (more or less) objectively clear
> criteria for what goes in and what does not?

It's funny that you said that. As I was updating the RR CTV, I
realized what a beast it is right now. I thought about making a github
project earlier today that would have more detailed examples and
information.

I see two problems with that as the *sole* solution.

First, it is divorced from CRAN CTV and that is a place that people
know and will look. I had no idea of Luca's work for this exact
reason.

Secondly, it might be intimidating for new R users who, I think, are the
targeted cohort for the CTVs.

How about a relatively broad definition that is succinct in content
with a link to a GitHub repo?

Thanks,

Max



Re: [Rd] Programming Tools CTV

2015-01-22 Thread Achim Zeileis

On Thu, 22 Jan 2015, Max Kuhn wrote:


> I've had a lot of requests for additions to the reproducible research
> task view that fall into a grey area (to me at least).
>
> For example, roxygen2 is a tool that broadly enables reproducibility
> but I see it more as a tool for better programming. I'm about to check
> in a new version of the task view that includes packrat and
> checkpoint, as they seem closer to reproducible research, but also
> feel like coding tools.
>
> There are a few other packages that many would find useful for better
> coding: devtools, testthat, lintr, codetools, svTools, rbenchmark,
> pkgutils, etc.
>
> There might be some overlap with the HPC task view. I would think that
> rJava, Rcpp and the like are better suited there but this is arguable.
>
> The last time I proposed something like this, Martin deftly convinced
> me to be the maintainer. It is probably better for everyone if we
> avoid that on this occasion.
>
> * Does anyone else see the need for this?
>
> * What other packages fit into this bin?
>
> * Would anyone like to volunteer?

Max, thanks for the suggestion. We had a somewhat related proposal on
R-devel from Luca Braglia a couple of months ago, suggesting a "Package
Development" task view:
https://mailman.stat.ethz.ch/pipermail/r-devel/2014-July/069454.html

He put up some ideas on Github:
https://github.com/lbraglia/PackageDevelopmentTaskView

When Luca asked me (ctv maintainer) and Dirk (HPC task view maintainer)
for feedback off-list, I replied that it is important that task views are
focused in order to be useful and maintainable. My feeling was that
"PackageDevelopment" was too broad and also "ProgrammingTools" is still
too broad, I think. This could mean a lot of things/tools to a lot of
people.

But maybe it would be possible to factor out some aspect that is sharp and
clear(er)? Or split it up into bits where there are (more or less)
objectively clear criteria for what goes in and what does not?

Best,
Z

> Thanks,
>
> Max



Re: [Rd] Programming Tools CTV

2015-01-22 Thread Luca Braglia
Hi,

this summer, after a few mails on this list, I started something similar
(feeling the same need)... here is the repo

https://github.com/lbraglia/PackageDevelopmentTaskView

Currently it's pretty much frozen since I'm working on other projects in my
free software spare time (and I likely won't return to it), but it could
be a starting point for someone else interested.


Best, Luca

PS: in that case, following some mails with Dirk and Achim, HPC stuff
à la Rcpp and friends should not be copied from Dirk's task view; it is
better to point to it... that was on my mental TODO list.

2015-01-22 18:23 GMT+01:00 Gregory R. Warnes :
> I second the motion for a Programming Tools CRAN Task View.
>
> I would also think it could contain things like Rcpp, R6, etc.
>
> -Greg
>
>
>> On Jan 22, 2015, at 10:20 AM, Max Kuhn  wrote:
>>
>> I've had a lot of requests for additions to the reproducible research
>> task view that fall into a grey area (to me at least).
>>
>> For example, roxygen2 is a tool that broadly enables reproducibility
>> but I see it more as a tool for better programming. I'm about to check
>> in a new version of the task view that includes packrat and
>> checkpoint, as they seem closer to reproducible research, but also
>> feel like coding tools.
>>
>> There are a few other packages that many would find useful for better
>> coding: devtools, testthat, lintr, codetools, svTools, rbenchmark,
>> pkgutils, etc.
>>
>> There might be some overlap with the HPC task view. I would think that
>> rJava, Rcpp and the like are better suited there but this is arguable.
>>
>> The last time I proposed something like this, Martin deftly convinced
>> me to be the maintainer. It is probably better for everyone if we
>> avoid that on this occasion.
>>
>> * Does anyone else see the need for this?
>>
>> * What other packages fit into this bin?
>>
>> * Would anyone like to volunteer?
>>
>> Thanks,
>>
>> Max


Re: [Rd] Programming Tools CTV

2015-01-22 Thread Henrik Bengtsson
On Thu, Jan 22, 2015 at 7:20 AM, Max Kuhn  wrote:
> I've had a lot of requests for additions to the reproducible research
> task view that fall into a grey area (to me at least).
>
> For example, roxygen2 is a tool that broadly enables reproducibility
> but I see it more as a tool for better programming. I'm about to check
> in a new version of the task view that includes packrat and
> checkpoint, as they seem closer to reproducible research, but also
> feel like coding tools.
>
> There are a few other packages that many would find useful for better
> coding: devtools, testthat, lintr, codetools, svTools, rbenchmark,
> pkgutils, etc.
>
> There might be some overlap with the HPC task view. I would think that
> rJava, Rcpp and the like are better suited there but this is arguable.
>
> The last time I proposed something like this, Martin deftly convinced
> me to be the maintainer. It is probably better for everyone if we
> avoid that on this occasion.
>
> * Does anyone else see the need for this?
>
> * What other packages fit into this bin?
>
> * Would anyone like to volunteer?

Thanks for your work on this.

May I suggest a Git/GitHub repository for this?  That lowers the
barriers for contributions substantially, e.g. via issues or,
even better, via pull requests (== point'n'click for you).  If you need
to mirror/push it to an SVN repository, I'm sure that's pretty easy to
do (and likely also to automate).

/Henrik

PS. Sorry, I'm not volunteering; too much on my plate.

>
> Thanks,
>
> Max


Re: [Rd] Programming Tools CTV

2015-01-22 Thread Gregory R. Warnes
I second the motion for a Programming Tools CRAN Task View.

I would also think it could contain things like Rcpp, R6, etc. 

-Greg


> On Jan 22, 2015, at 10:20 AM, Max Kuhn  wrote:
> 
> I've had a lot of requests for additions to the reproducible research
> task view that fall into a grey area (to me at least).
> 
> For example, roxygen2 is a tool that broadly enables reproducibility
> but I see it more as a tool for better programming. I'm about to check
> in a new version of the task view that includes packrat and
> checkpoint, as they seem closer to reproducible research, but also
> feel like coding tools.
> 
> There are a few other packages that many would find useful for better
> coding: devtools, testthat, lintr, codetools, svTools, rbenchmark,
> pkgutils, etc.
> 
> There might be some overlap with the HPC task view. I would think that
> rJava, Rcpp and the like are better suited there but this is arguable.
> 
> The last time I proposed something like this, Martin deftly convinced
> me to be the maintainer. It is probably better for everyone if we
> avoid that on this occasion.
> 
> * Does anyone else see the need for this?
> 
> * What other packages fit into this bin?
> 
> * Would anyone like to volunteer?
> 
> Thanks,
> 
> Max


[Rd] R CMD check: Locale not set to C?

2015-01-22 Thread Tobias Setz
Dear All

The "R CMD check" on the "zoo" (1.7-11) package results in an error in my
environment. It can be reduced to the following example:


> require(zoo)
> read.zoo(system.file("doc", "demo1.txt", package = "zoo"), sep = "|",
format="%d %b %Y")

Error in read.zoo(system.file("doc", "demo1.txt", package = "zoo"), sep =
"|",  :
  index has bad entries at data rows: 14 15 16 17 18 19 20


I am using the following environment (on Windows 7):


> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base


The problem is the locale settings. In "demo1.txt" the months are
abbreviated in English, while my environment only accepts German
abbreviations. The problem can be solved by setting the time locale:

> Sys.setlocale("LC_TIME", "English")
or
> Sys.setlocale("LC_TIME", "C")
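
A minimal sketch of the same workaround applied only for the duration of one parse, so the session's time locale is restored afterwards. The helper name parse_c_date is hypothetical (it is not part of zoo or base R), and the date string is an arbitrary example rather than a row from "demo1.txt".

```r
# Hypothetical helper: parse English month abbreviations independently
# of the session's LC_TIME setting.
parse_c_date <- function(x, format = "%d %b %Y") {
    old <- Sys.getlocale("LC_TIME")
    # Restore the previous time locale when the function exits.
    on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
    Sys.setlocale("LC_TIME", "C")
    as.Date(x, format = format)
}

d <- parse_c_date("14 Feb 2005")
print(d)
```

Note that the sentence quoted from the manual below mentions only collation, so it seems plausible that LC_TIME is left at the system default during "R CMD check"; that is a reading of the quote, not something verified here.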


Now, for "R CMD check", the manual
(http://cran.r-project.org/doc/manuals/r-release/R-exts.html) states the
following:

- "R CMD check and R CMD build run R processes with --vanilla..."
So there is no possibility to set the locale (in contrast to the environment
variables) through an "Rprofile" file...

- "All these tests are run with collation set to the C locale..."
If I set "LC_ALL" or only "LC_TIME" to "C" the example shown at the top
actually works if I run it manually.


However, if I run "R CMD check" I get the ERROR.
So, are the locales really set to "C" for "R CMD check"?
If yes, why would the example above not work?
If no, how could I achieve custom locale settings?

Thanks!
Tobias



--
Tobias Setz
 
Rmetrics Association
tobias.s...@rmetrics.org
www.rmetrics.org



[Rd] Programming Tools CTV

2015-01-22 Thread Max Kuhn
I've had a lot of requests for additions to the reproducible research
task view that fall into a grey area (to me at least).

For example, roxygen2 is a tool that broadly enables reproducibility
but I see it more as a tool for better programming. I'm about to check
in a new version of the task view that includes packrat and
checkpoint, as they seem closer to reproducible research, but also
feel like coding tools.

There are a few other packages that many would find useful for better
coding: devtools, testthat, lintr, codetools, svTools, rbenchmark,
pkgutils, etc.

There might be some overlap with the HPC task view. I would think that
rJava, Rcpp and the like are better suited there but this is arguable.

The last time I proposed something like this, Martin deftly convinced
me to be the maintainer. It is probably better for everyone if we
avoid that on this occasion.

* Does anyone else see the need for this?

* What other packages fit into this bin?

* Would anyone like to volunteer?

Thanks,

Max



Re: [Rd] reducing redundant work in methods package

2015-01-22 Thread Michael Lawrence
Actually, after reading the comment about it being OK for it to be
NULL, I see it's not a bug after all.

On Thu, Jan 22, 2015 at 5:57 AM, Michael Lawrence  wrote:
> I also just noticed that there is a bug: identical(ans, FALSE) should
> be is.null(ans).
>
> So no error is thrown:
>> methods:::genericForPrimitive("foo")
> NULL
>
> Will fix.
>
> On Wed, Jan 21, 2015 at 3:13 PM, Peter Haverty  wrote:
>> Doing it like this:
>>
>> genericForPrimitive <- function(f, where = topenv(parent.frame()),
>>                                 mustFind = TRUE) {
>>     ans = .BasicFunsList[[f]]
>>     ## this element may not exist (yet, during loading), don't test null
>>     if(mustFind && identical(ans, FALSE))
>>         stop(gettextf("methods may not be defined for primitive function %s in this version of R",
>>                       sQuote(f)),
>>              domain = NA)
>>     ans
>> }
>>
>> or this:
>>
>> genericForPrimitive <- function(f, where = topenv(parent.frame()),
>>                                 mustFind = TRUE) {
>>     env = asNamespace("methods")
>>     funs <- env[[".BasicFunsList"]]
>>     ans = funs[[f]]
>>     ## this element may not exist (yet, during loading), don't test null
>>     if(mustFind && identical(ans, FALSE))
>>         stop(gettextf("methods may not be defined for primitive function %s in this version of R",
>>                       sQuote(f)),
>>              domain = NA)
>>     ans
>> }
>>
>> Seems to work just fine.
>>
>> Yes, "el" and "elNamed" can probably go now.
>>
>>
>> Pete
>>
>> 
>> Peter M. Haverty, Ph.D.
>> Genentech, Inc.
>> phave...@gene.com
>>
>> On Wed, Jan 21, 2015 at 2:26 PM, Michael Lawrence
>>  wrote:
>>>
>>> Note that setMethod() resolves .BasicFunsList in the methods namespace
>>> directly when setting a method on a primitive. Somehow there should be
>>> consistency between genericForPrimitive() and the check in setMethod().
>>>
>>> Also, we can probably step away from the use of elNamed(), given that [[
>>> now uses exact matching.
>>>
>>> Have you tried patching methods to use .BasicFunsList directly as in
>>> setMethod?
>>>
>>>
>>> On Wed, Jan 21, 2015 at 10:41 AM, Peter Haverty 
>>> wrote:

>>>> Hi all,
>>>>
>>>> The function call series genericForPrimitive -> .findBasicFuns ->
>>>> .findAll happens 4400 times while the GenomicRanges package is loading.
>>>> Each time .findAll follows a chain of environments to determine that the
>>>> methods namespace is the only one that holds a variable called
>>>> .BasicFunsList. This accounts for ~10% of package loading time. I'm sure
>>>> there is some history to that design, but would it be possible to
>>>> shortcut this operation? Could .BasicFunsList be initialized in the
>>>> methods namespace at startup and might genericForPrimitive just go
>>>> straight there?
>>>>
>>>> Does anyone on the list know why it works this way?
>>>>
>>>> There are some other cases of seemingly redundant work, but this seems
>>>> like an easy one to address.
>>>>
>>>> I have included some code below that was used to investigate some of the
>>>> above.
>>>>
>>>> # Try this to count calls to a function
>>>>
>>>> .count <- 0
>>>> trace(methods:::.findBasicFuns,
>>>>       tracer = function() { .count <<- .count + 1 })
>>>> library(GenomicRanges)
>>>> print(.count)
>>>>
>>>> # Try this to capture the input and output of a set of functions you
>>>> # wish to refactor
>>>>
>>>> .init_test_data_collection <- function(ns = asNamespace("methods")) {
>>>>     funs = c("isClassUnion", "getClass", "genericForPrimitive",
>>>>              "possibleExtends", ".dataSlot", ".requirePackage",
>>>>              ".classEnv", "getClassDef", "outerLabels",
>>>>              ".getClassFromCache", "getFunction")
>>>>     message(paste0("\nCollecting data for unit tests on ",
>>>>                    paste(funs, collapse=", "), " ...\n"))
>>>>     # Make env with list to hold test input/output
>>>>     TEST_ENV <- new.env()
>>>>     for (fname in funs) {
>>>>         # Make placeholder for input/output for future runs of this
>>>>         # function (actually probably not necessary, will just be
>>>>         # c(NULL, list(first result)) the first time)
>>>>         TEST_ENV[[fname]] = list()
>>>>         # Construct test version of function
>>>>         unlockBinding(fname, ns)
>>>>         fun = get(fname, envir=ns, mode="function")
>>>>         funbody = deparse(body(fun))
>>>>         newfun <- fun
>>>>         newfun.body = c(
>>>>             sprintf("fname = '%s'", fname),
>>>>             "TEST_INFO = list()",
>>>>             "TEST_INFO$input = mget(names(formals(fname)))",
>>>>             c("realfun <- function()", funbody),
>>>>             "TEST_INFO$output = realfun()",
>>>>             "TEST_ENV[[fname]] = c(TEST_ENV[[fname]], list(TEST_INFO))",
>>>>             "return(TEST_INFO$output)")
>>>>         body(newfun) = as.call(c(as.name("{"),
>>>>                                  as.list(parse(text=newfun.body))))
>>>>         assign(fname

Re: [Rd] reducing redundant work in methods package

2015-01-22 Thread Michael Lawrence
I also just noticed that there is a bug: identical(ans, FALSE) should
be is.null(ans).

So no error is thrown:
> methods:::genericForPrimitive("foo")
NULL

Will fix.
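
To make the difference between the two checks concrete, here is a small sketch. The list `funs` is a hypothetical stand-in for a lookup table such as .BasicFunsList; its contents are invented for illustration and do not reflect the real table in the methods package.

```r
# Hypothetical stand-in for a lookup table such as .BasicFunsList:
# a function for an allowed name, FALSE for a name marked as not allowed.
funs <- list(length = function(x) NULL, "for" = FALSE)

absent   <- funs[["foo"]]   # no such element, so [[ returns NULL
disabled <- funs[["for"]]   # element exists and is FALSE

checks <- c(
    absent_identical_FALSE   = identical(absent, FALSE),
    absent_is_null           = is.null(absent),
    disabled_identical_FALSE = identical(disabled, FALSE),
    disabled_is_null         = is.null(disabled)
)
print(checks)
```

With identical(ans, FALSE) an absent name passes through silently as NULL, while is.null(ans) would instead flag absent names and let FALSE entries through; which behaviour is wanted depends on whether NULL is an acceptable result.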

On Wed, Jan 21, 2015 at 3:13 PM, Peter Haverty  wrote:
> Doing it like this:
>
> genericForPrimitive <- function(f, where = topenv(parent.frame()),
>                                 mustFind = TRUE) {
>     ans = .BasicFunsList[[f]]
>     ## this element may not exist (yet, during loading), don't test null
>     if(mustFind && identical(ans, FALSE))
>         stop(gettextf("methods may not be defined for primitive function %s in this version of R",
>                       sQuote(f)),
>              domain = NA)
>     ans
> }
>
> or this:
>
> genericForPrimitive <- function(f, where = topenv(parent.frame()),
>                                 mustFind = TRUE) {
>     env = asNamespace("methods")
>     funs <- env[[".BasicFunsList"]]
>     ans = funs[[f]]
>     ## this element may not exist (yet, during loading), don't test null
>     if(mustFind && identical(ans, FALSE))
>         stop(gettextf("methods may not be defined for primitive function %s in this version of R",
>                       sQuote(f)),
>              domain = NA)
>     ans
> }
>
> Seems to work just fine.
>
> Yes, "el" and "elNamed" can probably go now.
>
>
> Pete
>
> 
> Peter M. Haverty, Ph.D.
> Genentech, Inc.
> phave...@gene.com
>
> On Wed, Jan 21, 2015 at 2:26 PM, Michael Lawrence
>  wrote:
>>
>> Note that setMethod() resolves .BasicFunsList in the methods namespace
>> directly when setting a method on a primitive. Somehow there should be
>> consistency between genericForPrimitive() and the check in setMethod().
>>
>> Also, we can probably step away from the use of elNamed(), given that [[
>> now uses exact matching.
>>
>> Have you tried patching methods to use .BasicFunsList directly as in
>> setMethod?
>>
>>
>> On Wed, Jan 21, 2015 at 10:41 AM, Peter Haverty 
>> wrote:
>>>
>>> Hi all,
>>>
>>> The function call series genericForPrimitive -> .findBasicFuns ->
>>> .findAll
>>> happens 4400 times while the GenomicRanges package is loading.  Each time
>>> .findAll follows a chain of environments to determine that the methods
>>> namespace is the only one that holds a variable called .BasicFunsList.
>>> This
>>> accounts for ~10% of package loading time. I'm sure there is some history
>>> to that design, but would it be possible to shortcut this operation? Could
>>> .BasicFunsList be initialized in the methods namespace at startup and
>>> might
>>> genericForPrimitive just go straight there?
>>>
>>> Does anyone on the list know why it works this way?
>>>
>>> There are some other cases of seemingly redundant work, but this seems
>>> like
>>> an easy one to address.
>>>
>>> I have included some code below that was used to investigate some of the
>>> above.
>>>
>>> # Try this to count calls to a function
>>>
>>> .count <- 0
>>> trace(methods:::.findBasicFuns,
>>>       tracer = function() { .count <<- .count + 1 })
>>> library(GenomicRanges)
>>> print(.count)
>>>
>>> # Try this to capture the input and output of a set of functions you
>>> # wish to refactor
>>>
>>> .init_test_data_collection <- function(ns = asNamespace("methods")) {
>>>     funs = c("isClassUnion", "getClass", "genericForPrimitive",
>>>              "possibleExtends", ".dataSlot", ".requirePackage",
>>>              ".classEnv", "getClassDef", "outerLabels",
>>>              ".getClassFromCache", "getFunction")
>>>     message(paste0("\nCollecting data for unit tests on ",
>>>                    paste(funs, collapse=", "), " ...\n"))
>>>     # Make env with list to hold test input/output
>>>     TEST_ENV <- new.env()
>>>     for (fname in funs) {
>>>         # Make placeholder for input/output for future runs of this
>>>         # function (actually probably not necessary, will just be
>>>         # c(NULL, list(first result)) the first time)
>>>         TEST_ENV[[fname]] = list()
>>>         # Construct test version of function
>>>         unlockBinding(fname, ns)
>>>         fun = get(fname, envir=ns, mode="function")
>>>         funbody = deparse(body(fun))
>>>         newfun <- fun
>>>         newfun.body = c(
>>>             sprintf("fname = '%s'", fname),
>>>             "TEST_INFO = list()",
>>>             "TEST_INFO$input = mget(names(formals(fname)))",
>>>             c("realfun <- function()", funbody),
>>>             "TEST_INFO$output = realfun()",
>>>             "TEST_ENV[[fname]] = c(TEST_ENV[[fname]], list(TEST_INFO))",
>>>             "return(TEST_INFO$output)")
>>>         body(newfun) = as.call(c(as.name("{"),
>>>                                  as.list(parse(text=newfun.body))))
>>>         assign(fname, newfun, envir=ns)
>>>     }
>>>     return(TEST_ENV)
>>> }
>>> # run code, print items in TEST_ENV
>>>
>>> The relevant code is in methods/R/BasicFunsList.R and
>>> methods/R/ClassExtensions.R
>>> Pete
>>>
>>> 
>>> Peter M. Haverty, Ph.D.
>>> Genentech, Inc.
>>> phave...@gene.com
>>>
>>> [[al