Re: [Rd] Use of C++ in Packages

2019-03-29 Thread Simon Urbanek
Kevin,


> On Mar 29, 2019, at 17:01, Kevin Ushey wrote:
> 
> I think it's also worth saying that some of these issues affect C code
> as well; e.g. this is not safe:
> 
>     FILE* f = fopen(...);
>     Rf_eval(...);
>     fclose(f);
> 

I fully agree, but developers using C are well aware of the necessity of
handling the lifespan of objects explicitly, so at least there are no surprises.


> whereas the C++ equivalent would likely handle closing of the file in the 
> destructor. In other words, I think many users just may not be cognizant of 
> the fact that most R APIs can longjmp, and what that implies for cleanup of 
> allocated resources. R_alloc() may help solve the issue specifically for 
> memory allocations, but for any library interface that has a 'open' and 
> 'close' step, the same sort of issue will arise.
> 

Well, I hope that anyone writing native code in a package is well aware of that
and will use an external pointer with a finalizer to clean up native objects
from any 3rd-party library that are created during the call.


> What I believe we should do, and what Rcpp has made steps towards, is make it 
> possible to interact with some subset of the R API safely from C++ contexts. 
> This has always been possible with e.g. R_ToplevelExec() and 
> R_ExecWithCleanup(), and now things are even better with R_UnwindProtect(). 
> In theory, as a prototype, an R package could provide a 'safe' C++ interface 
> to the R API using R_UnwindProtect() and friends as appropriate, and client 
> packages could import and link to that package to gain access to the 
> interface. Code generators (as Rcpp Attributes does) can handle some of the 
> pain in these interfaces, so that users are mostly insulated from the nitty 
> gritty details.
> 

I agree that we should strive to provide tools that make it safer, but note
that it still requires the participation of users - they have to use such
facilities or else they hit the same problem. So we can only fix this for the
future, but let's start now.


> I agree that the content of Tomas's post is very helpful, especially since I 
> expect many R programmers who dip their toes into the C++ world are not aware 
> of the caveats of talking to R from C++. However, I don't think it's helpful 
> to recommend "don't use C++"; rather, I believe the question should be, "what 
> can we do to make it possible to easily and safely interact with R from 
> C++?". Because, as I understand it, all of the problems raised are solvable: 
> either through a well-defined C++ interface, or through better education.
> 

I think the recommendation would be different if such tools existed, but they 
don't. It was based on the current reality which is not so rosy.  Apparently 
the post had its effect of mobilizing C++ proponents to do something about it, 
which is great, because if this leads to some solution, the recommendation in 
the future may change to "use C++ using tools XYZ".


> I'll add my own opinion: writing correct C code is an incredibly difficult 
> task. C++, while obviously not perfect, makes things substantially easier 
> with tools like RAII, the STL, smart pointers, and so on. And I strongly 
> believe that C++ (with Rcpp) is still a better choice than C for new users 
> who want to interface with R from compiled code.
> 

My take is that Rcpp makes the interface *look* easier, but you still have to
understand more about the R API than you think. Hence it is much easier to write
buggy code. Personally, that's why I don't like it (apart from the code bloat),
because things are hidden that will get you into trouble, whereas using the C
API is at least very clear - you have to understand what it's doing when you
use it. That said, I'm obviously biased since I know a lot about R internals ;)
so this doesn't necessarily generalize.


> tl;dr: I (and I think most others) just wish the summary had a more positive 
> outlook for the future of C++ with R.
> 

Well, unless someone actually takes the initiative there is no reason to 
believe in a bright future of C++. As we have seen with the lack of adoption of 
CXXR (which I thought was an incredible achievement), not enough people seem to 
really care about C++. If that is not true, then let's come out of hiding, get 
together and address it (it seems that this thread is a good start).

Cheers,
Simon



> Best,
> Kevin
> 
> On Fri, Mar 29, 2019 at 10:16 AM Simon Urbanek
>  wrote:
>> 
>> Jim,
>> 
>> I think the main point of Tomas' post was to alert R users to the fact that 
>> there are very serious issues that you have to understand when interfacing R 
>> from C++. Using C++ code from R is fine, in many cases you only want to 
>> access R data, use some library or compute in C++ and return results. Such 
>> use-cases are completely fine in C++ as they don't need to trigger the 
>> issues mentioned and it should be made clear that it was not what Tomas' 
>> blog was about.
>> 
>> I agree with Tomas that it is safer to give an adv

Re: [Rd] Use of C++ in Packages

2019-03-29 Thread Gabriel Becker
Hi Jim (et al.),

Comments inline (and assume any offense was unintended, these kinds of
things can be tricky to talk about).

On Fri, Mar 29, 2019 at 8:19 AM Jim Hester wrote:

> First, thank you to Tomas for writing his recent post[0] on the R
> developer blog. It raised important issues in interfacing R's C API
> and C++ code.
>
> However I do _not_ think the conclusion reached in the post is helpful
>   > don’t use C++ to interface with R
>

I was a bit surprised at the strength of this too, but it's understandable
given the content/motivation of the post.

My personal takeaway, without putting any words in Tomas' or R-core's
mouths at all, is that the crux here is that using c++ in R packages safely
is a LOT less trivial than people in the wider R community think it is
these days. Or rather, there are things you can do safely quite easily when
using c++ in an R package, and things you can't, but that distinction a)
isn't really on many people's radar, b) isn't super trivial to identify
at any given time, and c) depends on internal implementation details so
isn't stable / safe to rely on across time anyway. There are a lot of
reasons for a), and none of them, nor anything else I'm about to say,
constitutes a criticism of Rcpp or its developers.

I've always thought that we as tool/software developers in this space
should make things seem as easy and convenient to users as they
can be/intrinsically are, *but not easier*. I don't know how popular that
second part I put in there is generally, but personally I think it's true
and pretty important not to leave off. I read Tomas' post as suggesting
that as a community, without pointing fingers or laying any individual
blame, we have unintentionally crossed the "as easy as it actually is/can be
to do right" line when it comes to the impression we give to novice/journeyman
package developers regarding using c++ to interact with the R internals. I
honestly claim little familiarity with c++ but it seems like Tomas is the
relevant expert on both it and hard-core details about how aspects of the R
internals work, so if he tells us that that has happened, we should probably
listen.


> There are now more than 1,600 packages on CRAN using C++, the time is
> long past when that type of warning is going to be useful to the R
> community.
>

Here I disagree pretty strongly. I think the warning is very useful -
unless these issues were widely known before the post (my impression is
that they weren't) - and ignoring its contents or encouraging others to do
so as influential members of the R community would be irresponsible.

I mean, the reality of the situation as it exists now is more or less (I'd
assume a great deal 'more' than 'less', personally) what Tomas described,
right? Furthermore, regardless of what changes may come in the future, it
seems very unlikely any of them will be in this coming release (since grand
feature freeze is like, today?) so we're talking a year out, at LEAST.
Given that, this advice, or at least a more nuanced stance that gives the
information from the post proper weight and is different from the
prevailing sentiment now, basically has to be realistic in the short term.

At the very least I think the post tells us that we need to be really
careful as a community with the "you want speed throw some c++ in your
package at it, you can learn how in a day and it's super easy and basically
free" messaging. The reality is more nuanced than that, at best, even if
ultimately in many situations that is a valid/reasonable approach.


>
> These same issues will also occur with any newer language (such as
> Rust or Julia[1]) which uses RAII to manage resources and tries to
> interface with R. It doesn't seem a productive way forward for R to
> say it can't interface with these languages without first doing
> expensive copies into an intermediate heap.
>
> The advice to avoid C++ is also antithetical to John Chambers vision
> of first S and R as a interface language (from Extending R [2])
>
>   > The *interface* principle has always been central to R and to S
> before. An interface to subroutines was _the_ way to extend the first
> version of S. Subroutine interfaces have continued to be central to R.
>
> The book also has extensive sections on both C++ (via Rcpp) and Julia,
> so clearly John thinks these are legitimate ways to extend R.
>
> So if 'don't use C++' is not realistic and the current R API does not
> allow safe use of C++ exceptions what are the alternatives?
>

Again, nothing is going to change about this for a year, *at least* (AFAIK,
not on R-core) so we have to make it at least somewhat realistic; perhaps
not the blanket moratorium that Tomas advocated - though IMHO statements
from R-core about what is safe/supported when operating in the R arena should
be granted *a lot* of weight - but certainly not the prevailing sentiment
it was responding to, either. That is true even if we commit to also
looking for ways to improve the situation in the long

Re: [Rd] default for 'signif.stars'

2019-03-29 Thread Abs Spurdle
> If we were to invent lm() now, how would we solve the problem of big P?
> I don't think we would use stars.

Assuming that this is a good idea in the first place, here's a simple
solution, in the context of backward selection.

One could sort the terms, from lowest p-value to highest p-value.
If each variable is associated with more than one parameter (e.g.
interactions), then it complicates things, but the same principle
applies.

It would be possible to group terms based on their significance level;
however, this is unlikely to be popular. You could also use a head() and
tail() approach, something I've been using a lot in other contexts.

However, I think a better solution is to automate the backward selection
process, although that requires decision rules, and we're back to the
original problem.


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Use of C++ in Packages

2019-03-29 Thread Kevin Ushey
I think it's also worth saying that some of these issues affect C code
as well; e.g. this is not safe:

    FILE* f = fopen(...);
    Rf_eval(...);
    fclose(f);

whereas the C++ equivalent would likely handle closing of the file in
the destructor. In other words, I think many users just may not be
cognizant of the fact that most R APIs can longjmp, and what that
implies for cleanup of allocated resources. R_alloc() may help solve
the issue specifically for memory allocations, but for any library
interface that has an 'open' and 'close' step, the same sort of issue
will arise.
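
To make the failure mode concrete, here is a minimal sketch in plain C that does not use the R API at all. It assumes a hypothetical `toplevel` jump target standing in for R's top-level context and a hypothetical `eval_that_fails()` standing in for an R API call that raises an error; a simple counter stands in for the open file.

```c
#include <setjmp.h>

/* Hypothetical stand-ins: `toplevel` plays the role of R's top-level
   context, `open_handles` counts resources that were not released. */
static jmp_buf toplevel;
static int open_handles = 0;

/* Plays the role of an R API call that signals an error. */
static void eval_that_fails(void) {
    longjmp(toplevel, 1);      /* control never returns to the caller */
}

/* Same shape as the fopen / Rf_eval / fclose sequence. */
static void unsafe_work(void) {
    open_handles++;            /* like fopen() */
    eval_that_fails();         /* like Rf_eval() raising an error */
    open_handles--;            /* like fclose(): skipped by the jump */
}

/* Returns the number of handles still open after one failed call. */
int leaked_after_error(void) {
    open_handles = 0;
    if (setjmp(toplevel) == 0) {
        unsafe_work();
    }
    return open_handles;
}
```

Under these assumptions, `leaked_after_error()` returns 1: the jump skips the release step, so one handle stays open.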

What I believe we should do, and what Rcpp has made steps towards, is
make it possible to interact with some subset of the R API safely from
C++ contexts. This has always been possible with e.g. R_ToplevelExec()
and R_ExecWithCleanup(), and now things are even better with
R_UnwindProtect(). In theory, as a prototype, an R package could
provide a 'safe' C++ interface to the R API using R_UnwindProtect()
and friends as appropriate, and client packages could import and link
to that package to gain access to the interface. Code generators (as
Rcpp Attributes does) can handle some of the pain in these interfaces,
so that users are mostly insulated from the nitty gritty details.

I agree that the content of Tomas's post is very helpful, especially
since I expect many R programmers who dip their toes into the C++
world are not aware of the caveats of talking to R from C++. However,
I don't think it's helpful to recommend "don't use C++"; rather, I
believe the question should be, "what can we do to make it possible to
easily and safely interact with R from C++?". Because, as I understand
it, all of the problems raised are solvable: either through a
well-defined C++ interface, or through better education.

I'll add my own opinion: writing correct C code is an incredibly
difficult task. C++, while obviously not perfect, makes things
substantially easier with tools like RAII, the STL, smart pointers,
and so on. And I strongly believe that C++ (with Rcpp) is still a
better choice than C for new users who want to interface with R from
compiled code.

tl;dr: I (and I think most others) just wish the summary had a more
positive outlook for the future of C++ with R.

Best,
Kevin

On Fri, Mar 29, 2019 at 10:16 AM Simon Urbanek wrote:
>
> Jim,
>
> I think the main point of Tomas' post was to alert R users to the fact that 
> there are very serious issues that you have to understand when interfacing R 
> from C++. Using C++ code from R is fine, in many cases you only want to 
> access R data, use some library or compute in C++ and return results. Such 
> use-cases are completely fine in C++ as they don't need to trigger the issues 
> mentioned and it should be made clear that it was not what Tomas' blog was 
> about.
>
> I agree with Tomas that it is safer to give an advice to not use C++ to call 
> R API since C++ may give a false impression that you don't need to know what 
> you're doing. Note that it is possible to avoid longjmps by using 
> R_ExecWithCleanup() which can catch any longjmps from the called function. So 
> if you know what you're doing you can make things work. I think the issue 
> here is not necessarily lack of tools, it is lack of knowledge - which is why 
> I think Tomas' post is so important.
>
> Cheers,
> Simon
>
>
> > On Mar 29, 2019, at 11:19 AM, Jim Hester  wrote:
> >
> > First, thank you to Tomas for writing his recent post[0] on the R
> > developer blog. It raised important issues in interfacing R's C API
> > and C++ code.
> >
> > However I do _not_ think the conclusion reached in the post is helpful
> >> don’t use C++ to interface with R
> >
> > There are now more than 1,600 packages on CRAN using C++, the time is
> > long past when that type of warning is going to be useful to the R
> > community.
> >
> > These same issues will also occur with any newer language (such as
> > Rust or Julia[1]) which uses RAII to manage resources and tries to
> > interface with R. It doesn't seem a productive way forward for R to
> > say it can't interface with these languages without first doing
> > expensive copies into an intermediate heap.
> >
> > The advice to avoid C++ is also antithetical to John Chambers vision
> > of first S and R as a interface language (from Extending R [2])
> >
> >> The *interface* principle has always been central to R and to S
> > before. An interface to subroutines was _the_ way to extend the first
> > version of S. Subroutine interfaces have continued to be central to R.
> >
> > The book also has extensive sections on both C++ (via Rcpp) and Julia,
> > so clearly John thinks these are legitimate ways to extend R.
> >
> > So if 'don't use C++' is not realistic and the current R API does not
> > allow safe use of C++ exceptions what are the alternatives?
> >
> > One thing we could do is look how this is handled in other languages
> > written in C which also use longjmp for errors.
> >
> > Lua is one example

Re: [Rd] Use of C++ in Packages

2019-03-29 Thread Simon Urbanek
Jim,

I think the main point of Tomas' post was to alert R users to the fact that
there are very serious issues that you have to understand when interfacing R
from C++. Using C++ code from R is fine; in many cases you only want to access
R data, use some library or compute in C++ and return results. Such use-cases
are completely fine in C++ as they don't need to trigger the issues mentioned,
and it should be made clear that this was not what Tomas' blog was about.

I agree with Tomas that it is safer to give the advice not to use C++ to call
the R API, since C++ may give a false impression that you don't need to know
what you're doing. Note that it is possible to avoid longjmps by using
R_ExecWithCleanup() which can catch any longjmps from the called function. So
if you know what you're doing you can make things work. I think the issue here
is not necessarily lack of tools, it is lack of knowledge - which is why I
think Tomas' post is so important.
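
As a rough illustration of the cleanup idea, the following plain-C sketch is loosely modelled on the argument shape of R_ExecWithCleanup() (a function, its data, a cleanup function, its data). It is not R's implementation and does not use the R API; `jump_target`, `fail()` and the handle counter are hypothetical stand-ins. In this sketch the wrapper intercepts the jump, runs the cleanup, and then lets the jump continue to the outer target.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical model, not the R API. */
static jmp_buf *jump_target = NULL;  /* where an "error" currently lands */
static int open_handles = 0;         /* resources not yet released */

/* Stand-in for an R error: jump to the current target. */
static void fail(void) { longjmp(*jump_target, 1); }

/* Run fun(data), then cleanfun(cleandata), even if fun() jumps out. */
void exec_with_cleanup(void (*fun)(void *), void *data,
                       void (*cleanfun)(void *), void *cleandata) {
    jmp_buf here;
    jmp_buf *outer = jump_target;    /* set before setjmp, so still valid later */
    jump_target = &here;
    if (setjmp(here) == 0) {
        fun(data);
        jump_target = outer;
        cleanfun(cleandata);         /* normal exit */
    } else {
        jump_target = outer;
        cleanfun(cleandata);         /* exit via jump */
        longjmp(*outer, 1);          /* continue to the outer target */
    }
}

static void open_then_fail(void *unused) { (void)unused; open_handles++; fail(); }
static void close_handle(void *unused)   { (void)unused; open_handles--; }

/* Returns the number of handles still open after a failed call. */
int handles_left_after_error(void) {
    jmp_buf top;
    open_handles = 0;
    jump_target = &top;
    if (setjmp(top) == 0) {
        exec_with_cleanup(open_then_fail, NULL, close_handle, NULL);
    }
    jump_target = NULL;
    return open_handles;
}
```

Under these assumptions, `handles_left_after_error()` returns 0, because the cleanup function runs before the jump reaches the outer target.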

Cheers,
Simon


> On Mar 29, 2019, at 11:19 AM, Jim Hester  wrote:
> 
> First, thank you to Tomas for writing his recent post[0] on the R
> developer blog. It raised important issues in interfacing R's C API
> and C++ code.
> 
> However I do _not_ think the conclusion reached in the post is helpful
>> don’t use C++ to interface with R
> 
> There are now more than 1,600 packages on CRAN using C++, the time is
> long past when that type of warning is going to be useful to the R
> community.
> 
> These same issues will also occur with any newer language (such as
> Rust or Julia[1]) which uses RAII to manage resources and tries to
> interface with R. It doesn't seem a productive way forward for R to
> say it can't interface with these languages without first doing
> expensive copies into an intermediate heap.
> 
> The advice to avoid C++ is also antithetical to John Chambers vision
> of first S and R as a interface language (from Extending R [2])
> 
>> The *interface* principle has always been central to R and to S
> before. An interface to subroutines was _the_ way to extend the first
> version of S. Subroutine interfaces have continued to be central to R.
> 
> The book also has extensive sections on both C++ (via Rcpp) and Julia,
> so clearly John thinks these are legitimate ways to extend R.
> 
> So if 'don't use C++' is not realistic and the current R API does not
> allow safe use of C++ exceptions what are the alternatives?
> 
> One thing we could do is look how this is handled in other languages
> written in C which also use longjmp for errors.
> 
> Lua is one example, they provide an alternative interface;
> lua_pcall[3] and lua_cpcall[4] which wrap a normal lua call and return
> an error code rather long jumping. These interfaces can then be safely
> wrapped by RAII - exception based languages.
> 
> This alternative error code interface is not just useful for C++, but
> also for resource cleanup in C, it is currently non-trivial to handle
> cleanup in all the possible cases a longjmp can occur (interrupts,
> warnings, custom conditions, timeouts any allocation etc.) even with R
> finalizers.
> 
> It is past time for R to consider a non-jumpy C interface, so it can
> continue to be used as an effective interface to programming routines
> in the years to come.
> 
> [0]: 
> https://developer.r-project.org/Blog/public/2019/03/28/use-of-c---in-packages/
> [1]: https://github.com/JuliaLang/julia/issues/28606
> [2]: https://doi.org/10.1201/9781315381305
> [3]: http://www.lua.org/manual/5.1/manual.html#lua_pcall
> [4]: http://www.lua.org/manual/5.1/manual.html#lua_cpcall
> 


Re: [Rd] Bug in the "reformulate" function in stats package

2019-03-29 Thread Ben Bolker
  I suspect that the issue is addressed (obliquely) in the examples,
which show that variables with spaces in them (or otherwise
'non-syntactic', i.e. not satisfying the constraints of legal R symbols)
can be handled by protecting them with backticks (``):

 ## using non-syntactic names:
 reformulate(c("`P/E`", "`% Growth`"), response = as.name("+-"))

It seems to me there could be room for a *documentation* patch (stating
explicitly that if termlabels has length > 1 its elements are
concatenated with "+", and explicitly stating that non-syntactic names
must be protected with back-ticks).  (There is a little bit of obscurity
in the fact that the elements of termlabels don't have to be
syntactically valid names: many will be included in formulas if they can
be interpreted as *parseable* expressions, e.g. reformulate("x<2"))

  I would be happy to give it a shot if the consensus is that it would
be worthwhile.

   One workaround to the OP's problem is below (may be worth including
as an example in docs)

> z <- c("a variable","another variable")
> reformulate(z)
Error in parse(text = termtext, keep.source = FALSE) :
  <text>:1:6: unexpected symbol
1:  ~ a variable
 ^
> reformulate(sprintf("`%s`",z))
~`a variable` + `another variable`




On 2019-03-29 11:54 a.m., J C Nash wrote:
> The main thing is to post the "small reproducible example".
> 
> My (rather long term experience) can be written
> 
>   if (exists("reproducible example") ) {
>  DeveloperFixHappens()
>   } else {
>  NULL
>   }
> 
> JN
> 
> On 2019-03-29 11:38 a.m., Saren Tasciyan wrote:
>> Well, first I can't sign in bugzilla myself, that is why I wrote here first. 
>> Also, I don't know if I have the time at
>> the moment to provide tests, multiple examples or more. If that is not ok or 
>> welcomed, that is fine, I can come back,
>> whenever I have more time to properly report the bug.
>>
>> I didn't find the existing bug report, sorry for that.
>>
>> Yes, it is related. My problem was that I have column names with spaces and 
>> current solution doesn't solve it. I have a
>> solution, which works for me and maybe also for others.
>>
>> Either, someone can register me to bugzilla or I can post it here, which 
>> could give some direction to developers. I
>> don't mind whichever is preferred here.
>>
>> Best,
>>
>> Saren
>>
>>
>> On 29.03.19 09:29, Martin Maechler wrote:
>>>> Saren Tasciyan
>>>>  on Thu, 28 Mar 2019 17:02:10 +0100 writes:
>>>  > Hi,
>>>  > I have found a bug in reformulate function and have a solution for 
>>> it. I
>>>  > was wondering, where I can submit it?
>>>
>>>  > Best,
>>>  > Saren
>>>
>>>
>>> Well, you could have given a small reproducible example
>>> depicting the bug, notably when posting here:
>>> Just a prose text with no R code or other technical content is
>>> almost always not really appropriate for the R-devel mailing list.
>>>
>>> Further, in such a case you should google a bit and hopefully
>>> have found
>>>     https://www.r-project.org/bugs.html
>>>
>>> which also mention reproducibility (and many more useful things).
>>>
>>> Then it also tells you about R's bug repository, also called
>>> "R's bugzilla" at https://bugs.r-project.org/
>>>
>>> and if you are diligent (but here, I'd say bugzilla is
>>> (configured?) far from ideal), you'd also find bug PR#17359
>>>
>>>     https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17359
>>>
>>> which was reported already on Nov 2017 .. and only fixed
>>> yesterday (in the "cleanup old bugs" process that happens
>>> often before the big new spring release of R).
>>>
>>> So is your bug the same as that one?
>>>
>>> Martin
>>>
>>>  > --
>>>  > Saren Tasciyan
>>>  > /PhD Student / Sixt Group/
>>>  > Institute of Science and Technology Austria
>>>  > Am Campus 1
>>>  > 3400 Klosterneuburg, Austria
>>>


Re: [Rd] Bug in the "reformulate" function in stats package

2019-03-29 Thread J C Nash
The main thing is to post the "small reproducible example".

My (rather long-term) experience can be written

  if (exists("reproducible example")) {
    DeveloperFixHappens()
  } else {
    NULL
  }

JN

On 2019-03-29 11:38 a.m., Saren Tasciyan wrote:
> Well, first I can't sign in bugzilla myself, that is why I wrote here first. 
> Also, I don't know if I have the time at
> the moment to provide tests, multiple examples or more. If that is not ok or 
> welcomed, that is fine, I can come back,
> whenever I have more time to properly report the bug.
> 
> I didn't find the existing bug report, sorry for that.
> 
> Yes, it is related. My problem was that I have column names with spaces and 
> current solution doesn't solve it. I have a
> solution, which works for me and maybe also for others.
> 
> Either, someone can register me to bugzilla or I can post it here, which 
> could give some direction to developers. I
> don't mind whichever is preferred here.
> 
> Best,
> 
> Saren
> 
> 
> On 29.03.19 09:29, Martin Maechler wrote:
>>> Saren Tasciyan
>>>  on Thu, 28 Mar 2019 17:02:10 +0100 writes:
>>  > Hi,
>>  > I have found a bug in reformulate function and have a solution for 
>> it. I
>>  > was wondering, where I can submit it?
>>
>>  > Best,
>>  > Saren
>>
>>
>> Well, you could have given a small reproducible example
>> depicting the bug, notably when posting here:
>> Just a prose text with no R code or other technical content is
>> almost always not really appropriate for the R-devel mailing list.
>>
>> Further, in such a case you should google a bit and hopefully
>> have found
>>     https://www.r-project.org/bugs.html
>>
>> which also mention reproducibility (and many more useful things).
>>
>> Then it also tells you about R's bug repository, also called
>> "R's bugzilla" at https://bugs.r-project.org/
>>
>> and if you are diligent (but here, I'd say bugzilla is
>> (configured?) far from ideal), you'd also find bug PR#17359
>>
>>     https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17359
>>
>> which was reported already on Nov 2017 .. and only fixed
>> yesterday (in the "cleanup old bugs" process that happens
>> often before the big new spring release of R).
>>
>> So is your bug the same as that one?
>>
>> Martin
>>
>>  > --
>>  > Saren Tasciyan
>>  > /PhD Student / Sixt Group/
>>  > Institute of Science and Technology Austria
>>  > Am Campus 1
>>  > 3400 Klosterneuburg, Austria
>>


Re: [Rd] Bug in the "reformulate" function in stats package

2019-03-29 Thread Saren Tasciyan
Well, first I can't sign in to bugzilla myself, that is why I wrote here
first. Also, I don't know if I have the time at the moment to provide
tests, multiple examples or more. If that is not ok or welcomed, that is
fine, I can come back whenever I have more time to properly report the bug.


I didn't find the existing bug report, sorry for that.

Yes, it is related. My problem was that I have column names with spaces
and the current solution doesn't solve it. I have a solution, which works
for me and maybe also for others.


Either someone can register me on bugzilla or I can post it here, which
could give some direction to developers. I don't mind whichever is
preferred here.


Best,

Saren


On 29.03.19 09:29, Martin Maechler wrote:

Saren Tasciyan
 on Thu, 28 Mar 2019 17:02:10 +0100 writes:

 > Hi,
 > I have found a bug in reformulate function and have a solution for it. I
 > was wondering, where I can submit it?

 > Best,
 > Saren


Well, you could have given a small reproducible example
depicting the bug, notably when posting here:
Just a prose text with no R code or other technical content is
almost always not really appropriate for the R-devel mailing list.

Further, in such a case you should google a bit and hopefully
have found
https://www.r-project.org/bugs.html

which also mention reproducibility (and many more useful things).

Then it also tells you about R's bug repository, also called
"R's bugzilla" at https://bugs.r-project.org/

and if you are diligent (but here, I'd say bugzilla is
(configured?) far from ideal), you'd also find bug PR#17359

https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17359

which was reported already on Nov 2017 .. and only fixed
yesterday (in the "cleanup old bugs" process that happens
often before the big new spring release of R).

So is your bug the same as that one?

Martin

 > --
 > Saren Tasciyan
 > /PhD Student / Sixt Group/
 > Institute of Science and Technology Austria
 > Am Campus 1
 > 3400 Klosterneuburg, Austria


--
Saren Tasciyan
/PhD Student / Sixt Group/
Institute of Science and Technology Austria
Am Campus 1
3400 Klosterneuburg, Austria




[Rd] Use of C++ in Packages

2019-03-29 Thread Jim Hester
First, thank you to Tomas for writing his recent post[0] on the R
developer blog. It raised important issues in interfacing R's C API
and C++ code.

However, I do _not_ think the conclusion reached in the post is helpful
  > don’t use C++ to interface with R

There are now more than 1,600 packages on CRAN using C++; the time is
long past when that type of warning is going to be useful to the R
community.

These same issues will also occur with any newer language (such as
Rust or Julia[1]) which uses RAII to manage resources and tries to
interface with R. It doesn't seem a productive way forward for R to
say it can't interface with these languages without first doing
expensive copies into an intermediate heap.

The advice to avoid C++ is also antithetical to John Chambers' vision
of first S and then R as an interface language (from Extending R [2])

  > The *interface* principle has always been central to R and to S
  > before. An interface to subroutines was _the_ way to extend the first
  > version of S. Subroutine interfaces have continued to be central to R.

The book also has extensive sections on both C++ (via Rcpp) and Julia,
so clearly John thinks these are legitimate ways to extend R.

So if 'don't use C++' is not realistic and the current R API does not
allow safe use of C++ exceptions, what are the alternatives?

One thing we could do is look at how this is handled in other languages
written in C which also use longjmp for errors.

Lua is one example: it provides an alternative interface,
lua_pcall[3] and lua_cpcall[4], which wrap a normal Lua call and return
an error code rather than long jumping. These interfaces can then be safely
wrapped by RAII- and exception-based languages.
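
The error-code style can be sketched in plain C as follows. This is an illustration under stated assumptions, not Lua's or R's actual API: `protected_call()`, `raise_error()` and the handler pointer are hypothetical names. The wrapper turns a longjmp raised inside the callee into an ordinary return value, which a caller in C or in an RAII language can then check.

```c
#include <setjmp.h>
#include <stdlib.h>

/* Hypothetical names throughout; not an existing R or Lua API. */
static jmp_buf *current_handler = NULL;  /* innermost protected call */
static int pending_error = 0;            /* code of the error in flight */

/* Stand-in for an interpreter error; `code` must be nonzero. */
void raise_error(int code) {
    if (current_handler == NULL) abort();  /* no protected call is active */
    pending_error = code;
    longjmp(*current_handler, 1);
}

/* Run fn(data); return 0 on success, or the raised code on error. */
int protected_call(void (*fn)(void *), void *data) {
    jmp_buf here;
    jmp_buf *outer = current_handler;  /* set before setjmp, so still valid later */
    current_handler = &here;
    if (setjmp(here) != 0) {
        current_handler = outer;       /* error path: restore and report */
        return pending_error;
    }
    fn(data);
    current_handler = outer;           /* normal path: restore */
    return 0;
}

/* Two example callees, one that succeeds and one that raises. */
void store_seven(void *data)  { *(int *)data = 7; }
void always_fails(void *data) { (void)data; raise_error(42); }
```

With these definitions, `protected_call(always_fails, NULL)` returns 42 instead of jumping past the caller, so the caller's own cleanup code still runs.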

This alternative error-code interface is not just useful for C++, but
also for resource cleanup in C. It is currently non-trivial to handle
cleanup in all the possible cases where a longjmp can occur (interrupts,
warnings, custom conditions, timeouts, any allocation, etc.), even with R
finalizers.

It is past time for R to consider a non-jumpy C interface, so it can
continue to be used as an effective interface to programming routines
in the years to come.

[0]: 
https://developer.r-project.org/Blog/public/2019/03/28/use-of-c---in-packages/
[1]: https://github.com/JuliaLang/julia/issues/28606
[2]: https://doi.org/10.1201/9781315381305
[3]: http://www.lua.org/manual/5.1/manual.html#lua_pcall
[4]: http://www.lua.org/manual/5.1/manual.html#lua_cpcall



Re: [Rd] Bug in the "reformulate" function in stats package

2019-03-29 Thread Martin Maechler
> Saren Tasciyan 
> on Thu, 28 Mar 2019 17:02:10 +0100 writes:

> Hi,
> I have found a bug in reformulate function and have a solution for it. I 
> was wondering, where I can submit it?

> Best,
> Saren


Well, you could have given a small reproducible example
depicting the bug, notably when posting here:
Just a prose text with no R code or other technical content is
almost always not really appropriate for the R-devel mailing list.

Further, in such a case you should google a bit, and you would hopefully
have found
https://www.r-project.org/bugs.html

which also mentions reproducibility (and many more useful things).

Then it also tells you about R's bug repository, also called
"R's bugzilla" at https://bugs.r-project.org/

and if you are diligent (but here, I'd say bugzilla is
(configured?) far from ideal), you'd also find bug PR#17359

   https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17359

which was reported already in Nov 2017 ... and only fixed
yesterday (in the "cleanup old bugs" process that often happens
before the big new spring release of R).

So is your bug the same as that one?

Martin

> -- 
> Saren Tasciyan
> /PhD Student / Sixt Group/
> Institute of Science and Technology Austria
> Am Campus 1
> 3400 Klosterneuburg, Austria




[Rd] Bug in the "reformulate" function in stats package

2019-03-29 Thread Saren Tasciyan

Hi,

I have found a bug in reformulate function and have a solution for it. I 
was wondering, where I can submit it?


Best,

Saren

--
Saren Tasciyan
/PhD Student / Sixt Group/
Institute of Science and Technology Austria
Am Campus 1
3400 Klosterneuburg, Austria



Re: [Rd] default for 'signif.stars'

2019-03-29 Thread Andrew Robinson
Hi Martin,

I take your point - but I'd argue that significance stars are a clumsy
solution to the very real problem that you outline, and their inclusion as
a default sends a signal about their appropriateness that I would prefer R
not to endorse.

My preference (to the extent that it matters) would be to see the
significance stars be an option but not a default one, and the addition of
different functionality to handle the many-predictor problem, perhaps a new
summary that more efficiently provides more useful information.

If we were to invent lm() now, how would we solve the problem of big P?  I
don't think we would use stars.

Cheers,

Andrew




On Thu, 28 Mar 2019 at 20:19, Martin Maechler wrote:

> > Lenth, Russell V
> > on Wed, 27 Mar 2019 00:06:08 + writes:
>
> > Dear R-Devel, As I am sure many of you know, a special
> > issue of The American Statistician just came out, and its
> > theme is the [mis]use of P values and the many common ways
> > in which they are abused. The lead editorial in that issue
> > mentions the 2014 ASA guidelines on P values, and goes one
> > step further, by now recommending that the words
> > "statistically significant" and related simplistic
> > interpretations no longer be used. There is much
> > discussion of the problems with drawing "bright lines"
> > concerning P values.
>
> > This is the position of a US society, but my sense is that
> > the statistical community worldwide is pretty much on the
> > same page.
>
> > Meanwhile, functions such as 'print.summary.lm' and
> > 'print.anova' have an argument 'signif.stars' that really
> > does involve drawing bright lines when it is set to
> > TRUE. And the default setting for the "show.signif.stars"
> > option is TRUE. Isn't it time to at least make
> > "show.signif.stars" default to FALSE? And, indeed, to
> > consider deprecating those 'signif.stars' options
> > altogether?
>
> Dear Russ,
> Abs has already given good reasons why this article may well be
> considered problematic.
>
> However, I think you and (many but not all) others who've raised
> this issue before you, slightly miss the following point.
>
> If p-values are misleading they should not be shown (and hence
> the signif.stars neither).
> That has been the approach adopted e.g., in the lme4 package
> *AND* has been an approach originally used in S and I think
> parts of R as well, in more places than now, notably, e.g., for
> print( summary() ).
>
> Fact is that users will write wrappers and their own packages
> just to get to p values, even in very doubtful cases...
> But anyway that (p values or not) is a different discussion
> which has some value.
>
> You however focus on the "significance stars".  I've argued for
> years why they are useful, as they are just a simple
> visualization of p values, and saving a lot of human time when
> there are many (fixed) effects looked at simultaneously.
> Why should users have to visually scan 20 or 50 numbers?  In
> modern Data analysis they should never have to but rather look
> at a visualization of those numbers. ... and that's what
> significance stars are, not more, nor less.
>
> Martin
>

