Re: [Rd] paste(character(0), collapse="", recycle0=FALSE) should be ""

2020-05-21 Thread William Dunlap via R-devel
> 1) Bill and Hervé (I think) propose that 'recycle0' should have
>   no effect whenever  'collapse = '

I think that collapse= should make paste() return a single string,
regardless of the value of recycle0.  E.g., I would like to see

> paste0("X",seq_len(3),collapse=", ", recycle0=TRUE)
[1] "X1, X2, X3"
> paste0("X",seq_len(0),collapse=", ", recycle0=TRUE)
[1] ""

Currently the latter gives character(0).

paste's collapse argument has traditionally acted after all the other
arguments were dealt with, as in the following not extensively tested
function.

altPaste <- function (..., collapse = NULL) {
    tmp <- paste(...)
    if (!is.null(collapse)) {
        paste(tmp, collapse=collapse)
    } else {
        tmp
    }
}

E.g., in post-R-4.0.0 R-devel
> altPaste("X", seq_len(3), sep="", collapse=", ")
[1] "X1, X2, X3"
> altPaste("X", seq_len(0), sep="", collapse=", ")
[1] "X"
> altPaste("X", seq_len(0), sep="", collapse=", ", recycle0=TRUE)
[1] ""

I think it would be good if the above function continued to act the same as
paste itself.
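
As a rough check of that equivalence, the following sketch compares altPaste() with paste() on a few inputs. The test inputs are arbitrary and chosen only for illustration, and recycle0 is deliberately left out so that the snippet also runs on R versions that predate the argument.

```r
# altPaste() as defined above: 'sep' stage first, then an optional 'collapse' stage
altPaste <- function(..., collapse = NULL) {
    tmp <- paste(...)
    if (!is.null(collapse)) paste(tmp, collapse = collapse) else tmp
}

# Hypothetical test inputs: non-empty, zero-length, and character
cases <- list(seq_len(3), seq_len(0), c("a", "b"))

# For each input, does the two-stage version agree with paste() itself?
same <- vapply(cases, function(x) {
    identical(altPaste("X", x, sep = "", collapse = ", "),
              paste("X", x, sep = "", collapse = ", "))
}, logical(1))
same
```

If any element of `same` were FALSE, collapse would no longer be acting purely as a second stage on the result of the first.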

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Thu, May 21, 2020 at 9:42 AM Martin Maechler 
wrote:

> > Hervé Pagès
> > on Fri, 15 May 2020 13:44:28 -0700 writes:
>
> > There is still the situation where **both** 'sep' and 'collapse' are
> > specified:
>
> >> paste(integer(0), "nth", sep="", collapse=",")
> > [1] "nth"
>
> > In that case 'recycle0' should **not** be ignored i.e.
>
> > paste(integer(0), "nth", sep="", collapse=",", recycle0=TRUE)
>
> > should return the empty string (and not character(0) like it does at
> > the moment).
>
> > In other words, 'recycle0' should only control the first operation
> > (the operation controlled by 'sep'). Which makes plenty of sense: the
> > 1st operation is binary (or n-ary) while the collapse operation is
> > unary. There is no concept of recycling in the context of unary
> > operations.
>
> Interesting, ..., and sounding somewhat convincing.
>
> > On 5/15/20 11:25, Gabriel Becker wrote:
> >> Hi all,
> >>
> >> This makes sense to me, but I would think that recycle0 and collapse
> >> should actually be incompatible and paste should throw an error if
> >> recycle0 were TRUE and collapse were declared in the same call. I don't
> >> think the value of recycle0 should be silently ignored if it is actively
> >> specified.
> >>
> >> ~G
>
> Just to summarize what I think we should know and agree (or be
> "disproven") and where this comes from ...
>
> 1) recycle0 is a new R 4.0.0 option in paste() / paste0() which by default
>    (recycle0 = FALSE) should (and *does* AFAIK) not change anything,
>    hence  paste() / paste0() behave completely back-compatible
>    if recycle0 is kept to FALSE.
>
> 2) recycle0 = TRUE is meant to give different behavior, notably
>    0-length arguments (among '...') should result in 0-length results.
>
>    The above does not specify what this means in detail, see 3)
>
> 3) The current R 4.0.0 implementation (for which I'm primarily responsible)
>    and help(paste)  are in accordance.
>    Notably the help page (Arguments -> 'recycle0' ; Details 1st para ; Examples)
>    says and shows how the 4.0.0 implementation has been meant to work.
>
> 4) Several provenly smart members of the R community argue that
>    both the implementation and the documentation of 'recycle0 =
>    TRUE'  should be changed to be more logical / coherent / sensical ..
>
> Is the above all correct in your view?
>
> Assuming yes,  I read basically two proposals, both agreeing
> that  recycle0 = TRUE  should only ever apply to the action of 'sep'
> but not the action of 'collapse'.
>
> 1) Bill and Hervé (I think) propose that 'recycle0' should have
>    no effect whenever  'collapse = '
>
> 2) Gabe proposes that 'collapse = ' and 'recycle0 = TRUE'
>    should be declared incompatible and error. If going in that
>    direction, I could also see them to give a warning (and
>    continue as if recycle0 = FALSE).
>
> I have not yet made my mind up but would tend to agree to "you guys",
> but I think that other R Core members should chime in, too.
>
> Martin
>
> >> On Fri, May 15, 2020 at 11:05 AM Hervé Pagès wrote:
> >>
> >> Totally agree with that.
> >>
> >> H.
> >>
> >> On 5/15/20 10:34, William Dunlap via R-devel wrote:
> >> > I agree: paste(collapse="something", ...) should always return a
> >> > single character string, regardless of the value of recycle0.  This
> >> > would be similar to when there are no non-NULL arguments to paste;
> >> > collapse="." gives a single empty string and collapse=NULL gives a
> >> > zero long character vector.
> >> >> paste()
> >> > character(0)
> >> >> paste(collapse=", ")
> >> > [1] ""
> >> >
> >> > Bill Dunlap
> >> > TIBCO Software
> >> > wdunla

Re: [Rd] paste(character(0), collapse="", recycle0=FALSE) should be ""

2020-05-21 Thread Martin Maechler
> Hervé Pagès 
> on Fri, 15 May 2020 13:44:28 -0700 writes:

> There is still the situation where **both** 'sep' and 'collapse' are 
> specified:

>> paste(integer(0), "nth", sep="", collapse=",")
> [1] "nth"

> In that case 'recycle0' should **not** be ignored i.e.

> paste(integer(0), "nth", sep="", collapse=",", recycle0=TRUE)

> should return the empty string (and not character(0) like it does at the 
> moment).

> In other words, 'recycle0' should only control the first operation (the 
> operation controlled by 'sep'). Which makes plenty of sense: the 1st 
> operation is binary (or n-ary) while the collapse operation is unary. 
> There is no concept of recycling in the context of unary operations.

Interesting, ..., and sounding somewhat convincing.

> On 5/15/20 11:25, Gabriel Becker wrote:
>> Hi all,
>> 
>> This makes sense to me, but I would think that recycle0 and collapse 
>> should actually be incompatible and paste should throw an error if 
>> recycle0 were TRUE and collapse were declared in the same call. I don't 
>> think the value of recycle0 should be silently ignored if it is actively 
>> specified.
>> 
>> ~G

Just to summarize what I think we should know and agree (or be
"disproven") and where this comes from ...

1) recycle0 is a new R 4.0.0 option in paste() / paste0() which by default
   (recycle0 = FALSE) should (and *does* AFAIK) not change anything,
   hence  paste() / paste0() behave completely back-compatible
   if recycle0 is kept to FALSE.

2) recycle0 = TRUE is meant to give different behavior, notably
   0-length arguments (among '...') should result in 0-length results.

   The above does not specify what this means in detail, see 3)

3) The current R 4.0.0 implementation (for which I'm primarily responsible)
   and help(paste)  are in accordance.
   Notably the help page (Arguments -> 'recycle0' ; Details 1st para ; Examples)
   says and shows how the 4.0.0 implementation has been meant to work.

4) Several provenly smart members of the R community argue that
   both the implementation and the documentation of 'recycle0 =
   TRUE'  should be changed to be more logical / coherent / sensical ..

Is the above all correct in your view?

Assuming yes,  I read basically two proposals, both agreeing
that  recycle0 = TRUE  should only ever apply to the action of 'sep'
but not the action of 'collapse'.

1) Bill and Hervé (I think) propose that 'recycle0' should have
   no effect whenever  'collapse = '

2) Gabe proposes that 'collapse = ' and 'recycle0 = TRUE'
   should be declared incompatible and error. If going in that
   direction, I could also see them to give a warning (and
   continue as if recycle0 = FALSE).
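
To make the two proposals concrete, here is a minimal sketch in plain R. The helpers sep_stage(), paste_prop1() and paste_prop2() are hypothetical names introduced only for illustration; they handle zero-length arguments by hand instead of calling paste(recycle0 = ), they assume unnamed '...' arguments, and they are not the implementation in R itself.

```r
# Stage 1 (the 'sep' operation): elementwise pasting. With recycle0 = TRUE,
# any zero-length argument makes the result zero-length.
sep_stage <- function(..., sep = " ", recycle0 = FALSE) {
    args <- list(...)
    if (recycle0 && any(lengths(args) == 0L))
        return(character(0))
    do.call(paste, c(args, list(sep = sep)))
}

# Proposal 1 (sketch): recycle0 only affects stage 1; whenever 'collapse' is
# a string, stage 2 still returns a single string.
paste_prop1 <- function(..., sep = " ", collapse = NULL, recycle0 = FALSE) {
    tmp <- sep_stage(..., sep = sep, recycle0 = recycle0)
    if (is.null(collapse)) tmp else paste(tmp, collapse = collapse)
}

# Proposal 2 (sketch): combining 'collapse' with recycle0 = TRUE is an error.
paste_prop2 <- function(..., sep = " ", collapse = NULL, recycle0 = FALSE) {
    if (recycle0 && !is.null(collapse))
        stop("'collapse' and 'recycle0 = TRUE' cannot be combined")
    paste_prop1(..., sep = sep, collapse = collapse, recycle0 = recycle0)
}

# Hervé's example under proposal 1: a single (empty) string
paste_prop1(integer(0), "nth", sep = "", collapse = ",", recycle0 = TRUE)
```

Under this reading, collapse never sees the recycle0 flag at all; it only sees whatever vector stage 1 produced.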

I have not yet made my mind up but would tend to agree to "you guys",
but I think that other R Core members should chime in, too.

Martin

>> On Fri, May 15, 2020 at 11:05 AM Hervé Pagès wrote:
>> 
>> Totally agree with that.
>> 
>> H.
>> 
>> On 5/15/20 10:34, William Dunlap via R-devel wrote:
>> > I agree: paste(collapse="something", ...) should always return a
>> > single character string, regardless of the value of recycle0.  This
>> > would be similar to when there are no non-NULL arguments to paste;
>> > collapse="." gives a single empty string and collapse=NULL gives a
>> > zero long character vector.
>> >> paste()
>> > character(0)
>> >> paste(collapse=", ")
>> > [1] ""
>> >
>> > Bill Dunlap
>> > TIBCO Software
>> > wdunlap tibco.com
>> 

>> >
>> >
>> > On Thu, Apr 30, 2020 at 9:56 PM suharto_anggono--- via R-devel <
>> > r-devel@r-project.org > wrote:
>> >
>> >> Without 'collapse', 'paste' pastes (concatenates) its arguments
>> >> elementwise (separated by 'sep', " " by default). New in R devel and R
>> >> patched, specifying recycle0 = TRUE makes mixing zero-length and
>> >> nonzero-length arguments result in length zero. The result of
>> >> paste(n, "th", sep = "", recycle0 = TRUE) always has the same length as
>> >> 'n'. Previously, the result was still as long as the longest argument,
>> >> with the zero-length argument treated like "". If all of the arguments
>> >> have length zero, 'recycle0' doesn't matter.
>> >>
>> >> As far as I understand, 'paste' with 'collapse' as a character string is
>> >> supposed to put together elements of a vector into a single character
>> >> string. I think 'recycle0' shouldn't change it.
>> >>
>> >> In current R d

Re: [Rd] Precision of function mean,bug?

2020-05-21 Thread Morgan Morgan
Sorry, posting back to the list.
Thank you all.
Morgan

On Thu, 21 May 2020, 16:33 Henrik Bengtsson, 
wrote:

> Hi.
>
> Good point and a good example. Feel free to post to the list. The purpose
> of my reply wasn't to take away Peter's point but to emphasize that
> base::mean() does a two-pass scan over the elements to lower the impact of
> addition of values with widely different values (classical problem in
> numerical analysis). But I can see how it may look like that.
>
> Cheers,
>
> Henrik
>
>
> On Thu, May 21, 2020, 03:21 Morgan Morgan 
> wrote:
>
>> Thank you Henrik for the feedback.
>> Note that for idx=4 and refine = TRUE,  your equality b==c is FALSE. I
>> think that as Peter said == can't be trusted with FP.
>> His example is good. Here is an even more shocking one.
>> a=0.786546798
>> b=a+ 1e6 -1e6
>> a==b
>> # [1] FALSE
>>
>> Best regards
>> Morgan Jacob
>>
>> On Wed, 20 May 2020, 20:18 Henrik Bengtsson, 
>> wrote:
>>
>>> On Wed, May 20, 2020 at 11:10 AM brodie gaslam via R-devel
>>>  wrote:
>>> >
>>> >  > On Wednesday, May 20, 2020, 7:00:09 AM EDT, peter dalgaard <
>>> pda...@gmail.com> wrote:
>>> > >
>>> > > Expected, see FAQ 7.31.
>>> > >
>>> > > You just can't trust == on FP operations. Notice also
>>> >
>>> > Additionally, since you're implementing a "mean" function you are
>>> testing
>>> > against R's mean, you might want to consider that R uses a two-pass
>>> > calculation[1] to reduce floating point precision error.
>>>
>>> This one is important.
>>>
>>> FWIW, matrixStats::mean2() provides argument refine=TRUE/FALSE to
>>> calculate mean with and without this two-pass calculation;
>>>
>>> > a <- c(x[idx],y[idx],z[idx]) / 3
>>> > b <- mean(c(x[idx],y[idx],z[idx]))
>>> > b == a
>>> [1] FALSE
>>> > b - a
>>> [1] 2.220446e-16
>>>
>>> > c <- matrixStats::mean2(c(x[idx],y[idx],z[idx]))  ## default to refine=TRUE
>>> > b == c
>>> [1] TRUE
>>> > b - c
>>> [1] 0
>>>
>>> > d <- matrixStats::mean2(c(x[idx],y[idx],z[idx]), refine=FALSE)
>>> > a == d
>>> [1] TRUE
>>> > a - d
>>> [1] 0
>>> > c == d
>>> [1] FALSE
>>> > c - d
>>> [1] 2.220446e-16
>>>
>>> Not surprisingly, the two-pass higher-precision version (refine=TRUE)
>>> takes roughly twice as long as the one-pass quick version
>>> (refine=FALSE).
>>>
>>> /Henrik
>>>
>>> >
>>> > Best,
>>> >
>>> > Brodie.
>>> >
>>> > [1]
>>> https://github.com/wch/r-source/blob/tags/R-4-0-0/src/main/summary.c#L482
>>> >
>>> > > > a2=(z[idx]+x[idx]+y[idx])/3
>>> > > > a2==a
>>> > > [1] FALSE
>>> > > > a2==b
>>> > > [1] TRUE
>>> > >
>>> > > -pd
>>> > >
>>> > > > On 20 May 2020, at 12:40 , Morgan Morgan <
>>> morgan.email...@gmail.com> wrote:
>>> > > >
>>> > > > Hello R-dev,
>>> > > >
>>> > > > Yesterday, while I was testing the newly implemented function
>>> pmean in
>>> > > > package kit, I noticed a mismatch in the output of the below R
>>> expressions.
>>> > > >
>>> > > > set.seed(123)
>>> > > > n=1e3L
>>> > > > idx=5
>>> > > > x=rnorm(n)
>>> > > > y=rnorm(n)
>>> > > > z=rnorm(n)
>>> > > > a=(x[idx]+y[idx]+z[idx])/3
>>> > > > b=mean(c(x[idx],y[idx],z[idx]))
>>> > > > a==b
>>> > > > # [1] FALSE
>>> > > >
>>> > > > For idx= 1, 2, 3, 4 the last line is equal to TRUE. For 5, 6 and
>>> many
>>> > > > others the difference is small but still.
>>> > > > Is that expected or is it a bug?
>>> >
>>> > __
>>> > R-devel@r-project.org mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
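
The two points made in this thread (do not rely on == for floating-point results, and mean() refines its first estimate in a second pass) can be tried out with the sketch below. mean_two_pass() is a hypothetical helper written only to illustrate the two-pass idea in plain R; R's C code accumulates in extended precision where available, so agreement with mean() is checked only up to a tolerance.

```r
# Morgan's example: adding and then subtracting 1e6 loses low-order bits of 'a'
a <- 0.786546798
b <- a + 1e6 - 1e6
a == b                    # exact comparison; reported as FALSE in the thread
isTRUE(all.equal(a, b))   # comparison up to a tolerance (about 1.5e-8 by default)

# Hypothetical helper: first estimate of the mean, then a correction
# computed from the mean of the residuals around that estimate.
mean_two_pass <- function(x) {
    n <- length(x)
    m <- sum(x) / n          # first pass
    m + sum(x - m) / n       # second pass
}

set.seed(123)
x <- rnorm(1e3)
all.equal(mean_two_pass(x), mean(x))
```

all.equal() compares relative differences against a tolerance, which is usually what is wanted when validating one numerical routine against another.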

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] failed check in reg-tests-1b.R

2020-05-21 Thread Martin Maechler
> Benjamin Tyner 
> on Tue, 19 May 2020 22:36:16 -0400 writes:

> Not certain this is actually a bug, so posting here.
> I'm on Ubuntu 18.04.4 LTS, building R version 4.0.0. The "configure" and 
> "make" steps are successful, but the "make check" step fails when it 
> gets to this part of ./tests/reg-tests-1b.R:

>    > ## methods() gave two wrong warnings in some cases:
>    > op <- options(warn = 2)# no warning, please!
>    > m1 <- methods(na.omit) ## should give (no warning):
>    > ##
>    > setClass("bla")
>    > setMethod("na.omit", "bla", function(object, ...) "na.omit()")
>    Error: package 'codetools' was installed before R 4.0.0: please 
> re-install it
>    Execution halted

> It appears to be picking up the older version of codetools from $R_LIBS; 
> if I unset R_LIBS, then it works just fine.

> So I'm wondering, is it a bug, or is the user's own fault for having 
> R_LIBS set whilst trying to build R?

Well, currently it seems to be the user's fault as in
  "if you don't do it, everything is fine"

But it has bitten me too, many times actually,
when going from R 3.y.z to R 4.y.z .
I have not started to investigate what it would mean, or even
if it really makes sense to disregard R_LIBS in such situations.

For building (and checking!) R itself from the sources, it may
seem to make sense indeed if such environment variables would be
temporarily unset (by one of the Makefiles, say).

Martin



Re: [Rd] [External] Feature Request: User Prompt + Message First Execution when "Managing Search Path Conflicts"

2020-05-21 Thread Juan Telleria Ruiz de Aguirre
Got it, thank you for pointing out the solution then.

Best,
Juan



Re: [Rd] [External] Feature Request: User Prompt + Message First Execution when "Managing Search Path Conflicts"

2020-05-21 Thread luke-tierney

It looks like you may have misunderstood my post so just to make sure:
There will be no patch to R to support this.

If this is something you want for yourself, then I have shown you how
you can do it.  You can put the code in a startup file if you like.

If you want your students to have this, then you can prepare a startup
file for them that does this.

Best,

luke

On Thu, 21 May 2020, Juan Telleria Ruiz de Aguirre wrote:


Thank you Mr. Tierney!

Using globalCallingHandlers() to directly handle
"packageConflictError" is an excellent idea!

The benefits I see for such an implementation are:
* The patch would be contained within the Conflict Error Handler,
which should reduce any side effects with an eventual implementation.
* And by making its usage optional, by setting for example
options(conflicts.policy.ask = TRUE), it should neither affect any
packages nor other base code.

Hope it allows R Users to work in a more agile manner, and guide R
Students through best practices of variable conflict handling in an
educative manner.

Thanks,
Juan


You can get what you are asking for now in R 4.0.0 with
globalCallingHandlers and using the packageConflictError object that
is signaled. This should get you started:

```
options(conflicts.policy = "strict")

## the condition class signaled on a conflict is "packageConflictError"

handle_conflicts <- function(e) {
    cat(conditionMessage(e))
    opt <- readline(prompt="1: mask.ok; 2: exclude. Choose: ")
    if (opt == "1")
        conflictRules(e$package, mask.ok = as.character(unlist(e$conflicts)))
    else if (opt == "2")
        conflictRules(e$package, exclude = as.character(unlist(e$conflicts)))
    stop("unresolved conflicts") ## ideally invoke a restart here
}

globalCallingHandlers(packageConflictError = handle_conflicts)

library(dplyr)
```

An IDE could provide a more sophisticated interface, like a dialog
allowing separate choices for each conflict. But this is best left up
to the IDE or the user.

The one addition to library that might be worth considering is to
provide a restart for the handler to invoke.
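
For readers unfamiliar with restarts, the following is a generic sketch of the mechanism such an addition would rely on. It does not use library() at all: load_demo(), the condition class "demoConflictError" and the restart name "resolveAndRetry" are hypothetical names made up for this illustration.

```r
# Hypothetical stand-in for a loader that signals a classed error on conflict
# and establishes a restart that a handler may invoke instead of failing.
load_demo <- function(has_conflict) {
    withRestarts(
        {
            if (has_conflict)
                stop(errorCondition("conflicts found",
                                    class = "demoConflictError"))
            "loaded"
        },
        resolveAndRetry = function() "loaded after resolving conflicts"
    )
}

# A calling handler runs while the signaling code is still on the stack,
# so it can transfer control to the restart established above.
result <- withCallingHandlers(
    load_demo(TRUE),
    demoConflictError = function(e) invokeRestart("resolveAndRetry")
)
result
```

With a restart available, a handler like the one above would not need to end in stop() and ask the user to repeat the call.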

Best,

luke





--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone: 319-335-3386
Department of Statistics and        Fax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email: luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:   http://www.stat.uiowa.edu



[Rd] Patch proposal for bug 17770 - xtabs does not act as documented for na.action = na.pass

2020-05-21 Thread SOEIRO Thomas
Dear all,

(This issue was previously reported on Bugzilla 
(https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17770) and discussed on 
Stack Overflow (https://stackoverflow.com/q/61240049).)

The documentation of xtabs says:

"na.action: When it is na.pass and formula has a left hand side (with counts), 
sum(*, na.rm = TRUE) is used instead of sum(*) for the counts."

However, this is not the case:
 
DF <- data.frame(group = c("a", "a", "b", "b"),
                 count = c(NA, TRUE, FALSE, TRUE))

xtabs(formula = count ~ group,
  data = DF,
  na.action = na.pass)

# group
# a b
#   1

In the code, na.rm is TRUE if and only if na.action = na.omit:

na.rm <- 
  identical(naAct, quote(na.omit)) || identical(naAct, na.omit) ||
  identical(naAct, "na.omit")

xtabs(formula = count ~ group,
  data = DF,
  na.action = na.omit)

# group
# a b
# 1 1

The example works as documented if we change the code to:

na.rm <- 
  identical(naAct, quote(na.pass)) || identical(naAct, na.pass) ||
  identical(naAct, "na.pass")

However, there may be something I am missing, and na.omit may be necessary for 
something else...
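
For reference, the per-group counts that the documented behaviour implies can be computed without xtabs(). This is only a sketch of the expected sums using tapply(); it says nothing about how xtabs() should be patched internally.

```r
DF <- data.frame(group = c("a", "a", "b", "b"),
                 count = c(NA, TRUE, FALSE, TRUE))

# Sums per group with NAs dropped, i.e. sum(*, na.rm = TRUE) as documented
with_na_rm <- tapply(DF$count, DF$group, sum, na.rm = TRUE)

# Sums per group with NAs kept, i.e. plain sum(*)
without_na_rm <- tapply(DF$count, DF$group, sum)

with_na_rm
without_na_rm
```

Group "a" is the one that distinguishes the two: its sum is 1 with na.rm = TRUE and NA otherwise.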

Best regards,

Thomas



Re: [Rd] [External] Feature Request: User Prompt + Message First Execution when "Managing Search Path Conflicts"

2020-05-21 Thread Juan Telleria Ruiz de Aguirre
Thank you Mr. Tierney!

Using globalCallingHandlers() to directly handle
"packageConflictError" is an excellent idea!

The benefits I see for such an implementation are:
* The patch would be contained within the Conflict Error Handler,
which should reduce any side effects with an eventual implementation.
* And by making its usage optional, by setting for example
options(conflicts.policy.ask = TRUE), it should neither affect any
packages nor other base code.

Hope it allows R Users to work in a more agile manner, and guide R
Students through best practices of variable conflict handling in an
educative manner.

Thanks,
Juan

> You can get what you are asking for now in R 4.0.0 with
> globalCallingHandlers and using the packageConflictError object that
> is signaled. This should get you started:
>
> ```
> options(conflicts.policy = "strict")
>
> ## the condition class signaled on a conflict is "packageConflictError"
>
> handle_conflicts <- function(e) {
>     cat(conditionMessage(e))
>     opt <- readline(prompt="1: mask.ok; 2: exclude. Choose: ")
>     if (opt == "1")
>         conflictRules(e$package, mask.ok = as.character(unlist(e$conflicts)))
>     else if (opt == "2")
>         conflictRules(e$package, exclude = as.character(unlist(e$conflicts)))
>     stop("unresolved conflicts") ## ideally invoke a restart here
> }
>
> globalCallingHandlers(packageConflictError = handle_conflicts)
>
> library(dplyr)
> ```
>
> An IDE could provide a more sophisticated interface, like a dialog
> allowing separate choices for each conflict. But this is best left up
> to the IDE or the user.
>
> The one addition to library that might be worth considering is to
> provide a restart for the handler to invoke.
>
> Best,
>
> luke
>
