Re: [Rd] specials and ::

2024-08-26 Thread Chris Black
It’s completely reasonable to decline to do extra work to support it, but at 
the same time: Qualified calls are widely used and recommended, and users are 
also being completely reasonable when they try to use them (probably without 
checking the manual!) and expect them to work.

Would there be a tolerably easy way to make the fit fail loudly on 
`survival::strata(…)` rather than return the wrong result?



> On Aug 26, 2024, at 7:42 AM, Therneau, Terry M., Ph.D. via R-devel 
>  wrote:
> 
> The survival package makes significant use of the "specials" argument of 
> terms(), before 
> calling model.frame; it is part of nearly every modeling function. The reason 
> is that 
> strata argments simply have to be handled differently than other things on 
> the right hand 
> side. Likewise for tt() and cluster(), though those are much less frequent.
> 
> I now get "bug reports" from the growing segment that believes one should put 
> packagename:: in front of every single instance.   For instance
>   fit <- survival::survdiff( survival::Surv(time, status) ~ ph.karno + 
> survival::strata(inst),  data= survival::lung)
> 
> This fails to give the correct answer because it fools terms(formula, 
> specials= 
> "strata").I've stood firm in my response of "that's your bug, not mine", 
> but I begin 
> to believe I am swimming uphill.   One person responded that it was company 
> policy to 
> qualify everything.
> 
> I don't see an easy way to fix survival, and even if I did it would be a 
> tremendous amout 
> of work.   What are other's thoughts?
> 
> Terry
> 
> 
> 
> -- 
> 
> Terry M Therneau, PhD
> Department of Quantitative Health Sciences
> Mayo Clinic
> thern...@mayo.edu
> 
> "TERR-ree THUR-noh"
> 
> [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [EXTERNAL] Re: NOTE: multiple local function definitions for ?fun? with different formal arguments

2024-02-07 Thread Chris Black
Hopefully to too much of a tangent: A related problem this check doesn’t catch 
is accidental top-level redefinitions in package code, such as

## a.R:
helper <- function() 1

f <- function() {
helper()
}
# “cool, f() must return 1"

## b.R:
helper <- function(x) 2

g <- function() {
helper()
}
# “cool, g() must return 2"

## Runtime: 
# > c(pkg::f(), pkg::g())
# [1] 2 2
# “oh right, only the last definition of helper() is used”
 
I’ve seen several variants of this issue in code from folks who are new to 
package development, especially if they're naively refactoring something that 
started out as an interactively-run analysis. Collaborators who are puzzled by 
it get my “packages are collections of objects not sequences of expressions, 
yes that needs to be in your mental model, here’s the link to RWE again” talk, 
but I would be happy to be able to point them to a check result to go along 
with it.

I don’t think this is grounds on its own to change a 20-year precedent, but in 
case anyone is collecting wishlist reasons to make the check look harder...

Thanks,
Chris

> On Feb 6, 2024, at 3:17 PM, Martin Morgan  wrote:
> 
> I went looking and found this in codetools, where it's been for 20 years
> 
> https://gitlab.com/luke-tierney/codetools/-/blame/master/R/codetools.R?ref_type=heads#L951
> 
> I think the call stack in codetools is checkUsagePackage -> checkUsageEnv -> 
> checkUsage, and these are similarly established. The call from the tools 
> package 
> https://github.com/wch/r-source/blame/95146f0f366a36899e4277a6a722964a51b93603/src/library/tools/R/QC.R#L4585
>  is also quite old.
> 
> I'm not sure this had been said explicitly, but perhaps the original intent 
> was to protect against accidentally redefining a local function. Obviously 
> one could do this with a local variable too, though that might less often be 
> an error…
> 
> toto <- function(mode) {
>tata <- function(a, b) a * b  # intended
>tata <- function(a, b) a / b  # oops
>…
> }
> 
> Another workaround is to actually name the local functions
> 
> toto <- function(mode) {
>tata <- function(a, b) a * b
>titi <- function(u, v, w) (u + v) / w
>if (mode == 1)
>tata
>else
>titi
> }
> 
> … or to use a switch statement
> 
> toto <- function(mode) {
>## fun <- switch(…) for use of `fun()` in toto
>switch(
>mode,
>tata = function(a, b) a * b,
>titi = function(u, v, w) (u + v) / w,
>stop("unknown `mode = '", mode, "'`")
>)
> }
> 
> … or similarly to write `fun <- if … else …`, assigning the result of the 
> `if` to `fun`. I guess this last formulation points to the fact that a more 
> careful analysis of Hervé's original code means that `fun` can only take one 
> value (only one branch of the `if` can be taken) so there can only be one 
> version of `fun` in any invocation of `toto()`.
> 
> Perhaps the local names (and string-valued 'mode') are suggestive of special 
> case, so serve as implicit documentation?
> 
> Adding `…` to `tata` doesn't seem like a good idea; toto(1)(3, 5, 7) no 
> longer signals an error.
> 
> There seems to be a lot in common with S3 and S4 methods, where `toto` 
> corresponds to the generic, `tata` and `titi` to methods. This 'dispatch' is 
> brought out by using `switch()`. There is plenty of opportunity for thinking 
> that you're invoking one method but actually you're invoking the other. For 
> instance with dplyr, I like that I can tbl |> print(n = 2) so much that I 
> find myself doing this with data.frame df |> print(n = 2), which is an error 
> (`n` partially matches `na.print`, and 2 is not a valid value); both methods 
> silently ignore the typo print(m = 2).
> 
> Martin Morgan
> 
> From: R-devel  on behalf of Henrik Bengtsson 
> 
> Date: Tuesday, February 6, 2024 at 4:34 PM
> To: Izmirlian, Grant (NIH/NCI) [E] 
> Cc: r-devel@r-project.org 
> Subject: Re: [Rd] [EXTERNAL] Re: NOTE: multiple local function definitions 
> for ?fun? with different formal arguments
> Here's a dummy example that I think illustrates the problem:
> 
> toto <- function() {
>  if (runif(1) < 0.5)
>function(a) a
>  else
>function(a,b) a+b
> }
> 
>> fcn <- toto()
>> fcn(1,2)
> [1] 3
>> fcn <- toto()
>> fcn(1,2)
> [1] 3
>> fcn <- toto()
>> fcn(1,2)
> Error in fcn(1, 2) : unused argument (2)
> 
> How can you use the returned function, if you get different arguments?
> 
> In your example, you cannot use the returned function without knowing
> 'mode', or by inspecting the returned function.  So, the warning is
> there to alert you to a potential bug.  Anecdotally, I'm pretty sure
> this R CMD check NOTE has caught at least one such bug in one of
> my/our packages.
> 
> If you want to keep the current design pattern, one approach could be
> to add ... to your function definitions:
> 
> toto <- function(mode)
> {
> if (mode == 1)
> fun <- function(a, b, ...) a*b
> else
> fun <- function(u, v, w) (u + v) / w
> fun
> }
>