Re: [Rd] Best practices in developing package: From a single file

2018-01-30 Thread Hadley Wickham
>> There is package.skeleton() in base R as you already mentioned. It drove me
>> bonkers that it creates packages which then fail R CMD check, so I wrote a
>> wrapper package (pkgKitten) with another helper function (kitten()) which
>> calls the base R helper and then cleans it up---but otherwise remains
>> faithful to it.
>
>
> Failing R CMD check isn't a big deal:  you want to be reminded to edit those
> incomplete help files.  But I think I recall that you couldn't even build
> the package that package.skeleton() created, and that indeed would be
> irritating, especially if you had a lot of functions so you had a lot of
> cleanup to do.  I don't know if that's still true because I generally use
> RStudio to create the initial package structure rather than calling
> package.skeleton myself.

Personally, I think the biggest problem with package.skeleton() is
that it assumes that the source of truth is objects in an environment.
This seems the wrong way around to me.
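
To make the "source of truth" point concrete, here is a minimal sketch of the file-first route that package.skeleton() also offers through its code_files argument. The file name hello.R, the function hello() and the package name helloPkg are made up for illustration; everything is written to temporary directories.

```r
# Hypothetical single-file project: one function in one .R file.
src <- file.path(tempdir(), "hello.R")
writeLines(
  "hello <- function(name = \"world\") paste0(\"Hello, \", name, \"!\")",
  src
)

# Fresh parent directory, so no existing package folder gets in the way.
out <- tempfile("skeleton-")
dir.create(out)

# code_files makes the file, not the objects in the workspace, the starting point.
utils::package.skeleton(name = "helloPkg", path = out, code_files = src)

# Inspect what was generated (DESCRIPTION, NAMESPACE, R/, man/, ...).
list.files(file.path(out, "helloPkg"), recursive = TRUE)
```

The generated help files are still only stubs, so the resulting skeleton needs editing before it is a finished package.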

Hadley

-- 
http://hadley.nz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Best practices in developing package: From a single file

2018-01-30 Thread Hadley Wickham
On Tue, Jan 30, 2018 at 4:55 PM, Duncan Murdoch wrote:
> On 30/01/2018 4:30 PM, Kenny Bell wrote:
>>
>> In response to Duncan regarding the use of roxygen2 from the point of view
>> of a current user, I believe the issue he brings up is one of correlation
>> rather than causation.
>
>
> Could be.  However, I think editing comments in a .R file is a bit harder
> than editing text in a .Rd file, so I think the format discourages editing.
> I think it does make it easier to pass R CMD check the first time, but I
> don't think you should be satisfied with that.

One counter-point: I find it much easier to remember to update the
documentation when you update the code, if the code and the
documentation are very close together. I think mingling code and
documentation in the same file does add a subtle pressure to write
shorter docs, but I'm not entirely sure that's a bad thing - for long
form writing, vignettes are a much better solution anyway (since you
often want to mingle code and explanation).

Personally, I don't find writing in comments any harder than writing
in .Rd files, especially now that you can write in markdown and have
it automatically translated to Rd formatting commands.  And on the
negative side of Rd, I find it frustrating to have to copy and paste
the function definition to the usage section every time I modify an
argument. It just feels like unnecessary busywork that the computer
should be able to do for me (although I do understand why it is not
possible).
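
For readers who have not seen the format, a small sketch of what such a comment block can look like. The function rescale01() is a made-up example, and the markdown syntax assumes it has been switched on for the package (for instance with `Roxygen: list(markdown = TRUE)` in DESCRIPTION). Running roxygen2::roxygenise() would then write the .Rd file, including a usage section derived from the function definition.

```r
#' Rescale a numeric vector to the unit interval
#'
#' Linearly maps `x` onto `[0, 1]`. With markdown enabled, backticks and
#' **bold** are translated into the corresponding Rd markup.
#'
#' @param x A numeric vector.
#' @param na.rm Should missing values be ignored when computing the range?
#' @return A numeric vector of the same length as `x`.
#' @examples
#' rescale01(c(2, 4, 6))
#' @export
rescale01 <- function(x, na.rm = TRUE) {
  rng <- range(x, na.rm = na.rm)
  (x - rng[1]) / (rng[2] - rng[1])
}
```

Renaming or adding an argument here only needs a matching @param line; the usage section is regenerated on the next run.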

>> Writing my first piece of R documentation was made much easier by using
>> roxygen2, and it shallowed the learning curve substantially.
>
> I'm not completely up to date on Roxygen2 these days:  can you do some pages
> in Rd, others in Roxygen?  That's not quite as good as being able to switch
> back and forth, but it would allow you to start in Roxygen, then gradually
> move to Rd when editing there was easier.

Yes, that's possible, and to protect you in a mixed environment,
roxygen2 will never overwrite a file that it did not itself create.
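
As a rough illustration of how hand-written and generated pages can be told apart in a mixed man/ directory: roxygen2-generated Rd files normally begin with a comment line saying so. The helper below is a hypothetical check built on that assumption, not roxygen2's own code, and the two demo files are created only for this example.

```r
# Hypothetical helper: does this Rd file carry the roxygen2 marker comment?
made_by_roxygen <- function(path) {
  first <- readLines(path, n = 1L, warn = FALSE)
  length(first) == 1L && grepl("^% Generated by roxygen2", first)
}

# Two throwaway files standing in for man/*.Rd
generated <- tempfile(fileext = ".Rd")
writeLines(c("% Generated by roxygen2: do not edit by hand", "\\name{foo}"), generated)

handwritten <- tempfile(fileext = ".Rd")
writeLines("\\name{bar}", handwritten)

made_by_roxygen(generated)
made_by_roxygen(handwritten)
```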

Hadley

-- 
http://hadley.nz



Re: [Rd] Best practices in developing package: From a single file

2018-01-30 Thread Duncan Murdoch

On 30/01/2018 4:12 PM, Dirk Eddelbuettel wrote:
>
> Mehmet,
>
> That is a loaded topic, not unlike other topics preoccupying us these days.
>
> There is package.skeleton() in base R as you already mentioned. It drove me
> bonkers that it creates packages which then fail R CMD check, so I wrote a
> wrapper package (pkgKitten) with another helper function (kitten()) which
> calls the base R helper and then cleans it up---but otherwise remains
> faithful to it.

Failing R CMD check isn't a big deal:  you want to be reminded to edit
those incomplete help files.  But I think I recall that you couldn't
even build the package that package.skeleton() created, and that indeed
would be irritating, especially if you had a lot of functions so you had
a lot of cleanup to do.  I don't know if that's still true because I
generally use RStudio to create the initial package structure rather
than calling package.skeleton myself.

Duncan Murdoch

> These days pkgKitten defaults to creating a per-package top-level help page
> that just references content from DESCRIPTION via a set of newer Rd macros as
> I find that helps keep them aligned. The most recent example of mine is
>    https://github.com/eddelbuettel/prrd/blob/master/man/prrd-package.Rd
> I use either this function or the RStudio template helper all the time.
>
> And similarly, other people have written similar helpers. You may get other
> pointers.
>
> And every couple of months someone writes a new tutorial about how to write a
> first package. Then social media goes gaga and we get half a dozen blog posts
> where someone celebrates finding said tutorial, reading it and following
> through with a new package.
>
> And many of us have taught workshops on creating packages. There is a lot of
> material out there, though lots of this material seems to be entirely ignorant
> of what came before it.
>
> And there has been lots, including Fritz's tutorial from a decade ago:
>
>  https://epub.ub.uni-muenchen.de/6175/  as well as on CRAN as
>  https://cran.r-project.org/doc/contrib/Leisch-CreatingPackages.pdf
>
> So I'd recommend you just experiment and set up your own helpers. After all
> the rule still holds: Anything you do more than three times should be a
> function, and every function should be in a package. So customize _your_
> function to create your package.
>
> Dirk





Re: [Rd] Best practices in developing package: From a single file

2018-01-30 Thread Duncan Murdoch

On 30/01/2018 4:30 PM, Kenny Bell wrote:
> In response to Duncan regarding the use of roxygen2 from the point of view
> of a current user, I believe the issue he brings up is one of correlation
> rather than causation.

Could be.  However, I think editing comments in a .R file is a bit
harder than editing text in a .Rd file, so I think the format
discourages editing.  I think it does make it easier to pass R CMD check
the first time, but I don't think you should be satisfied with that.

What would probably change my mind would be a two-way (or multi-way)
tool:  it takes input in Roxygen comments or Rd files (or something
else), and produces another format.  Then I'd probably choose to write
the first pass in Roxygen, and convert to Rd for editing.  Other people
might go in the opposite direction.  Or someone might write a fancy
WYSIWYG editor for people who like that style of editing.

A couple of years ago I was hoping someone would figure out a way to
create help page input in R Markdown, but I think that's tricky because
of the lack of semantic markup there.  There *was* a project last year
to work on other input methods; I dropped out before it got very far,
and I don't know its current status.

> Writing my first piece of R documentation was made much easier by using
> roxygen2, and it shallowed the learning curve substantially.

I'm not completely up to date on Roxygen2 these days:  can you do some
pages in Rd, others in Roxygen?  That's not quite as good as being able
to switch back and forth, but it would allow you to start in Roxygen,
then gradually move to Rd when editing there was easier.

> What Duncan may be observing is a general tendency of roxygen2 users to
> write overly concise documentation. However, I believe this to be caused by
> an omitted variable - likely the tendency of roxygen2 users to want outputs
> quickly. I can't see anything in roxygen2 that might suggest a causal link
> but I'd be interested in hearing specific examples.

I don't know about that.  *Everyone* wants output quickly.

Duncan Murdoch

> FWIW, I've also heard the same documentation criticism leveled at R in
> general (mostly from Stata and MATLAB users).
>
> Kenny
>
> On Wed, Jan 31, 2018, 10:12 AM Dirk Eddelbuettel wrote:
>
>> Mehmet,
>>
>> That is a loaded topic, not unlike other topics preoccupying us these
>> days.
>>
>> There is package.skeleton() in base R as you already mentioned. It drove me
>> bonkers that it creates packages which then fail R CMD check, so I wrote a
>> wrapper package (pkgKitten) with another helper function (kitten()) which
>> calls the base R helper and then cleans it up---but otherwise remains
>> faithful to it.
>>
>> These days pkgKitten defaults to creating a per-package top-level help page
>> that just references content from DESCRIPTION via a set of newer Rd macros
>> as I find that helps keep them aligned. The most recent example of mine is
>>    https://github.com/eddelbuettel/prrd/blob/master/man/prrd-package.Rd
>> I use either this function or the RStudio template helper all the time.
>>
>> And similarly, other people have written similar helpers. You may get other
>> pointers.
>>
>> And every couple of months someone writes a new tutorial about how to
>> write a first package. Then social media goes gaga and we get half a dozen
>> blog posts where someone celebrates finding said tutorial, reading it and
>> following through with a new package.
>>
>> And many of us have taught workshops on creating packages. There is a lot
>> of material out there, though lots of this material seems to be entirely
>> ignorant of what came before it.
>>
>> And there has been lots, including Fritz's tutorial from a decade ago:
>>
>>  https://epub.ub.uni-muenchen.de/6175/  as well as on CRAN as
>>  https://cran.r-project.org/doc/contrib/Leisch-CreatingPackages.pdf
>>
>> So I'd recommend you just experiment and set up your own helpers. After all
>> the rule still holds: Anything you do more than three times should be a
>> function, and every function should be in a package. So customize _your_
>> function to create your package.
>>
>> Dirk
>>
>> --
>> http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org


Re: [Rd] as.list method for by Objects

2018-01-30 Thread Michael Lawrence
I just meant that the minimal contract for as.list() appears to be that it
returns a VECSXP. To the user, we might say that is.list() will always
return TRUE. I'm not sure we can expect consistency across methods beyond
that, nor is it feasible at this point to match the semantics of the
methods package. It deals in "class space" while as.list() deals in
"typeof() space".

Michael
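
A short sketch of the two "spaces", using the same by() call that appears further down the thread. Only the typeof()/is.list() results and the class of the by object itself are annotated; what class() reports after as.list() depends on the method and the R version, which is exactly the point under discussion.

```r
# A "by" object built from a dataset that ships with R
b <- by(warpbreaks[, 1:2], warpbreaks[, "tension"], summary)

# "typeof() space": the storage type (a VECSXP at the C level)
typeof(b)            # "list"
is.list(as.list(b))  # TRUE

# "class space": the S3 class attribute
class(b)             # "by"
class(as.list(b))    # method- and version-dependent, see the thread
```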

On Tue, Jan 30, 2018 at 3:47 PM, Hervé Pagès  wrote:

> On 01/30/2018 02:50 PM, Michael Lawrence wrote:
>
>> by() does not always return a list. In Gabe's example, it returns an
>> integer, thus it is coerced to a list. as.list() means that it should be a
>> VECSXP, not necessarily with "list" in the class attribute.
>>
>
> The documentation is not particularly clear about what as.list()
> means for list derivatives. IMO clarifications should stick to
> simple concepts and formulations like "is.list(x) is TRUE" or
> "x is a list or a list derivative" rather than "x is a VECSXP".
> Coercion is useful beyond the use case of implementing a .C entry
> point and calling as.numeric/as.list/etc... on its arguments.
>
> This is why I was hoping that we could maybe discuss the possibility
> of making the as.list() contract less vague than just "as.list()
> must return a list or a list derivative".
>
> Again, I think that 2 things weigh quite a lot in that discussion:
>   1) as.list() returns an object of class "list" on a
>      data.frame (strict coercion). If all that as.list() needed to
>      do was to return a VECSXP, then as.list.default() already does
>      this on a data.frame so why did someone bother adding an
>      as.list.data.frame method that does strict coercion?
>   2) The S4 coercion system based on as() does strict coercion by
>  default.
>
> H.
>
>
>> Michael
>>
>>
>> On Tue, Jan 30, 2018 at 2:41 PM, Hervé Pagès wrote:
>>
>> Hi Gabe,
>>
>> Interestingly the behavior of as.list() on by objects seems to
>> depend on the object itself:
>>
>>  > b1 <- by(1:2, 1:2, identity)
>>  > class(as.list(b1))
>> [1] "list"
>>
>>  > b2 <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
>>  > class(as.list(b2))
>> [1] "by"
>>
>> This is with R 3.4.3 and R devel (2017-12-11 r73889).
>>
>> H.
>>
>> On 01/30/2018 02:33 PM, Gabriel Becker wrote:
>>
>> Dario,
>>
>> What version of R are you using? In my mildly old 3.4.0
>> installation and in the version of R-devel I have lying around
>> (also mildly old...)  I don't see the behavior I think you are
>> describing
>>
>>  > b = by(1:2, 1:2, identity)
>>
>>  > class(as.list(b))
>>
>>  [1] "list"
>>
>>  > sessionInfo()
>>
>>  R Under development (unstable) (2017-12-19 r73926)
>>
>>  Platform: x86_64-apple-darwin15.6.0 (64-bit)
>>
>>  Running under: OS X El Capitan 10.11.6
>>
>>
>>  Matrix products: default
>>
>>  BLAS:
>> /Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRblas.dylib
>>
>>  LAPACK:
>> /Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
>>
>>
>>  locale:
>>
>>  [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>>
>>  attached base packages:
>>
>>  [1] stats graphics  grDevices utils datasets  methods   base
>>
>>
>>  loaded via a namespace (and not attached):
>>
>>  [1] compiler_3.5.0
>>
>>  >
>>
>>
>> As for by not having a class definition, no S3 class has an
>> explicit definition, so this is somewhat par for the course
>> here...
>>
>> did I misunderstand something?
>>
>>
>> ~G
>>
>> On Tue, Jan 30, 2018 at 2:24 PM, Hervé Pagès wrote:
>>
>>  I agree that it makes sense to expect as.list() to perform
>>  a "strict coercion" i.e. to return an object of class "list",
>>  *even* on a list derivative. That's what as( , "list") does
>>  by default:
>>
>>    # on a data.frame object
>>    as(data.frame(), "list")  # object of class "list"
>>                              # (but strangely it drops the names)
>>
>>    # on a by object
>>    x <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
>>    as(x, "list")  # object of class "list"
>>
>>  More generally speaking as() is expected to perform "strict
>>  coercion" by default, unless called with 'strict=FALSE'.
>>
>>  That's also what as.list() does on a data.frame:
>>
>> as.list(data.frame())  # object of clas

Re: [Rd] as.list method for by Objects

2018-01-30 Thread Hervé Pagès

On 01/30/2018 02:50 PM, Michael Lawrence wrote:
> by() does not always return a list. In Gabe's example, it returns an
> integer, thus it is coerced to a list. as.list() means that it should be
> a VECSXP, not necessarily with "list" in the class attribute.


The documentation is not particularly clear about what as.list()
means for list derivatives. IMO clarifications should stick to
simple concepts and formulations like "is.list(x) is TRUE" or
"x is a list or a list derivative" rather than "x is a VECSXP".
Coercion is useful beyond the use case of implementing a .C entry
point and calling as.numeric/as.list/etc... on its arguments.

This is why I was hoping that we could maybe discuss the possibility
of making the as.list() contract less vague than just "as.list()
must return a list or a list derivative".

Again, I think that 2 things weigh quite a lot in that discussion:
  1) as.list() returns an object of class "list" on a
     data.frame (strict coercion). If all that as.list() needed to
     do was to return a VECSXP, then as.list.default() already does
     this on a data.frame so why did someone bother adding an
     as.list.data.frame method that does strict coercion?
  2) The S4 coercion system based on as() does strict coercion by
 default.
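
The first point can be checked directly. The sketch below uses a small made-up data.frame and compares the default method with the data.frame method; the annotated results reflect my understanding of base R and are worth re-running on your own version.

```r
df <- data.frame(a = 1:2, b = c("x", "y"))

# Default method: a data.frame is already of type "list", so it comes back unchanged
class(as.list.default(df))  # "data.frame"

# data.frame method: strict coercion, class attribute and row names are dropped
class(as.list(df))          # "list"
names(as.list(df))          # "a" "b"

# Same flavour of strict coercion for atomic vectors
class(as.numeric(1:3))      # "numeric"
```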

H.



Michael

On Tue, Jan 30, 2018 at 2:41 PM, Hervé Pagès wrote:


Hi Gabe,

Interestingly the behavior of as.list() on by objects seems to
depend on the object itself:

 > b1 <- by(1:2, 1:2, identity)
 > class(as.list(b1))
[1] "list"

 > b2 <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
 > class(as.list(b2))
[1] "by"

This is with R 3.4.3 and R devel (2017-12-11 r73889).

H.

On 01/30/2018 02:33 PM, Gabriel Becker wrote:

Dario,

What version of R are you using? In my mildly old 3.4.0
installation and in the version of R-devel I have lying around
(also mildly old...)  I don't see the behavior I think you are
describing

     > b = by(1:2, 1:2, identity)

     > class(as.list(b))

     [1] "list"

     > sessionInfo()

     R Under development (unstable) (2017-12-19 r73926)

     Platform: x86_64-apple-darwin15.6.0 (64-bit)

     Running under: OS X El Capitan 10.11.6


     Matrix products: default

     BLAS:

/Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRblas.dylib


     LAPACK:

/Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib



     locale:

     [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8


     attached base packages:

     [1] stats     graphics  grDevices utils     datasets  methods   base



     loaded via a namespace (and not attached):

     [1] compiler_3.5.0

     >


As for by not having a class definition, no S3 class has an
explicit definition, so this is somewhat par for the course here...

did I misunderstand something?


~G

On Tue, Jan 30, 2018 at 2:24 PM, Hervé Pagès wrote:

     I agree that it makes sense to expect as.list() to perform
     a "strict coercion" i.e. to return an object of class "list",
     *even* on a list derivative. That's what as( , "list") does
     by default:

        # on a data.frame object
        as(data.frame(), "list")  # object of class "list"
                                  # (but strangely it drops the names)

        # on a by object
        x <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
        as(x, "list")  # object of class "list"

     More generally speaking as() is expected to perform "strict
     coercion" by default, unless called with 'strict=FALSE'.

     That's also what as.list() does on a data.frame:

        as.list(data.frame())  # object of class "list"

     FWIW as.numeric() also performs "strict coercion" on an integer
     vector:

        as.numeric(1:3)  # object of class "numeric"

     So an as.list.env method that does the same as as(x, "list")
     would bring a small touch of consistency in an otherwise
     quite inconsistent world of coercion methods(*).

     H.

     (*) as(data.frame(), "list", strict=FALSE) doesn't do what you'd
          expect (just one of many examples)


     On 01/29/2018 05:00 PM, Dario Strbenac wrote:

         Good day,

         I'd like to suggest the addition of an as.list method
for a by
         object that actually returns a list of clas

Re: [Rd] Why R should never move to git

2018-01-30 Thread Suzen, Mehmet
Gabor, I was just pointing out options. I think it is more of a policy
decision than a technical one. For example, the very mailing list we
are using is run by ETH Zurich with Martin Maechler. But it can well
be run on google groups. Maybe this list should also move to google
groups, it is unlikely that Google would shut down google groups soon.

Best,
-m

On 31 January 2018 at 00:26, Gábor Csárdi  wrote:
> While this is a very hypothetical argument, you could at least explain
> _why_ you would think so.
>
> If you were thinking about the unlikely event of GitHub / GitLab
> going out of business, that is _not_ such a big deal for any active
> project that is hosted there.
>
> Gabor
>
> On Tue, Jan 30, 2018 at 11:07 PM, Suzen, Mehmet wrote:
>> This might be off topic, but if R-core development ever moves to git,
>> I think it would make sense to have its own git service hosted by a
>> university, rather than using
>> github or gitlab. It is possible via https://gogs.io/ project.
>>
>> Just for the record.
>>
>> Best,
>> -m
>>

Re: [Rd] Why R should never move to git

2018-01-30 Thread Gábor Csárdi
While this is a very hypothetical argument, you could at least explain
_why_ you would think so.

If you were thinking about the unlikely event of GitHub / GitLab
going out of business, that is _not_ such a big deal for any active
project that is hosted there.

Gabor

On Tue, Jan 30, 2018 at 11:07 PM, Suzen, Mehmet  wrote:
> This might be off topic, but if R-core development ever moves to git,
> I think it would make sense to have its own git service hosted by a
> university, rather than using
> github or gitlab. It is possible via https://gogs.io/ project.
>
> Just for the record.
>
> Best,
> -m
>


Re: [Rd] Why R should never move to git

2018-01-30 Thread Suzen, Mehmet
This might be off topic, but if R-core development ever moves to git,
I think it would make sense to have its own git service hosted by a
university, rather than using
github or gitlab. It is possible via https://gogs.io/ project.

Just for the record.

Best,
-m



Re: [Rd] Best practices in developing package: From a single file

2018-01-30 Thread Suzen, Mehmet
On 30 January 2018 at 21:31, Cook, Malcolm  wrote:
>
> I think you want to see the approach to generating a skeleton from a single 
> .R file presented in:
>
> Simple and sustainable R packaging using inlinedocs 
> http://inlinedocs.r-forge.r-project.org/
>
> I have not used it in some time but found it invaluable when I did.

For the record, the package has a JSS article as well:

https://www.jstatsoft.org/article/view/v054i06


Best,
-m



Re: [Rd] as.list method for by Objects

2018-01-30 Thread Michael Lawrence
by() does not always return a list. In Gabe's example, it returns an
integer, thus it is coerced to a list. as.list() means that it should be a
VECSXP, not necessarily with "list" in the class attribute.

Michael

On Tue, Jan 30, 2018 at 2:41 PM, Hervé Pagès  wrote:

> Hi Gabe,
>
> Interestingly the behavior of as.list() on by objects seems to
> depend on the object itself:
>
> > b1 <- by(1:2, 1:2, identity)
> > class(as.list(b1))
> [1] "list"
>
> > b2 <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
> > class(as.list(b2))
> [1] "by"
>
> This is with R 3.4.3 and R devel (2017-12-11 r73889).
>
> H.
>
> On 01/30/2018 02:33 PM, Gabriel Becker wrote:
>
>> Dario,
>>
>> What version of R are you using? In my mildly old 3.4.0 installation and
>> in the version of R-devel I have lying around (also mildly old...)  I don't
>> see the behavior I think you are describing
>>
>> > b = by(1:2, 1:2, identity)
>>
>> > class(as.list(b))
>>
>> [1] "list"
>>
>> > sessionInfo()
>>
>> R Under development (unstable) (2017-12-19 r73926)
>>
>> Platform: x86_64-apple-darwin15.6.0 (64-bit)
>>
>> Running under: OS X El Capitan 10.11.6
>>
>>
>> Matrix products: default
>>
>> BLAS:
>> /Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRblas.dylib
>>
>> LAPACK:
>> /Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
>>
>>
>> locale:
>>
>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>>
>> attached base packages:
>>
>> [1] stats graphics  grDevices utils datasets  methods   base
>>
>>
>> loaded via a namespace (and not attached):
>>
>> [1] compiler_3.5.0
>>
>> >
>>
>>
>> As for by not having a class definition, no S3 class has an explicit
>> definition, so this is somewhat par for the course here...
>>
>> did I misunderstand something?
>>
>>
>> ~G
>>
>> On Tue, Jan 30, 2018 at 2:24 PM, Hervé Pagès wrote:
>>
>> I agree that it makes sense to expect as.list() to perform
>> a "strict coercion" i.e. to return an object of class "list",
>> *even* on a list derivative. That's what as( , "list") does
>> by default:
>>
>>    # on a data.frame object
>>    as(data.frame(), "list")  # object of class "list"
>>                              # (but strangely it drops the names)
>>
>>    # on a by object
>>    x <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
>>    as(x, "list")  # object of class "list"
>>
>> More generally speaking as() is expected to perform "strict
>> coercion" by default, unless called with 'strict=FALSE'.
>>
>> That's also what as.list() does on a data.frame:
>>
>>    as.list(data.frame())  # object of class "list"
>>
>> FWIW as.numeric() also performs "strict coercion" on an integer
>> vector:
>>
>>    as.numeric(1:3)  # object of class "numeric"
>>
>> So an as.list.env method that does the same as as(x, "list")
>> would bring a small touch of consistency in an otherwise
>> quite inconsistent world of coercion methods(*).
>>
>> H.
>>
>> (*) as(data.frame(), "list", strict=FALSE) doesn't do what you'd
>>  expect (just one of many examples)
>>
>>
>> On 01/29/2018 05:00 PM, Dario Strbenac wrote:
>>
>> Good day,
>>
>> I'd like to suggest the addition of an as.list method for a by
>> object that actually returns a list of class "list". This would
>> make it safer to do type-checking, because is.list also returns
>> TRUE for a data.frame variable and using class(result) == "list"
>> is an alternative that only returns TRUE for lists. It's also
>> confusing initially that
>>
>> class(x)
>>
>> [1] "by"
>>
>> is.list(x)
>>
>> [1] TRUE
>>
>> since there's no explicit class definition for "by" and no
>> mention if it has any superclasses.
>>
>> --
>> Dario Strbenac
>> University of Sydney
>> Camperdown NSW 2050
>> Australia
>>
>>
>>
>> -- Hervé Pagès
>>
>> Program in Computational Biology
>> Division of Public Heal

Re: [Rd] as.list method for by Objects

2018-01-30 Thread Hervé Pagès

Hi Gabe,

Interestingly the behavior of as.list() on by objects seems to
depend on the object itself:

> b1 <- by(1:2, 1:2, identity)
> class(as.list(b1))
[1] "list"

> b2 <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
> class(as.list(b2))
[1] "by"

This is with R 3.4.3 and R devel (2017-12-11 r73889).

H.

On 01/30/2018 02:33 PM, Gabriel Becker wrote:

Dario,

What version of R are you using? In my mildly old 3.4.0 installation and
in the version of R-devel I have lying around (also mildly old...)  I
don't see the behavior I think you are describing


> b = by(1:2, 1:2, identity)

> class(as.list(b))

[1] "list"

> sessionInfo()

R Under development (unstable) (2017-12-19 r73926)

Platform: x86_64-apple-darwin15.6.0 (64-bit)

Running under: OS X El Capitan 10.11.6


Matrix products: default

BLAS:

/Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRblas.dylib

LAPACK:

/Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib


locale:

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8


attached base packages:

[1] stats     graphics  grDevices utils     datasets  methods   base


loaded via a namespace (and not attached):

[1] compiler_3.5.0

> 




As for by not having a class definition, no S3 class has an explicit 
definition, so this is somewhat par for the course here...


did I misunderstand something?


~G

On Tue, Jan 30, 2018 at 2:24 PM, Hervé Pagès wrote:


I agree that it makes sense to expect as.list() to perform
a "strict coercion" i.e. to return an object of class "list",
*even* on a list derivative. That's what as( , "list") does
by default:

   # on a data.frame object
   as(data.frame(), "list")  # object of class "list"
                             # (but strangely it drops the names)

   # on a by object
   x <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
   as(x, "list")  # object of class "list"

More generally speaking as() is expected to perform "strict
coercion" by default, unless called with 'strict=FALSE'.

That's also what as.list() does on a data.frame:

   as.list(data.frame())  # object of class "list"

FWIW as.numeric() also performs "strict coercion" on an integer
vector:

   as.numeric(1:3)  # object of class "numeric"

So an as.list.env method that does the same as as(x, "list")
would bring a small touch of consistency in an otherwise
quite inconsistent world of coercion methods(*).

H.

(*) as(data.frame(), "list", strict=FALSE) doesn't do what you'd
     expect (just one of many examples)


On 01/29/2018 05:00 PM, Dario Strbenac wrote:

Good day,

I'd like to suggest the addition of an as.list method for a by
object that actually returns a list of class "list". This would
make it safer to do type-checking, because is.list also returns
TRUE for a data.frame variable and using class(result) == "list"
is an alternative that only returns TRUE for lists. It's also
confusing initially that

class(x)

[1] "by"

is.list(x)

[1] TRUE

since there's no explicit class definition for "by" and no
mention if it has any superclasses.
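
The type-checking concern can be sketched in a few lines. The objects df and plain and the helper is_plain_list() are made up for illustration; identical() is used instead of == because a class attribute may have more than one element.

```r
df    <- data.frame(a = 1:2)
plain <- list(a = 1:2)

is.list(df)                      # TRUE, so is.list() alone cannot tell them apart
is.list(plain)                   # TRUE
identical(class(df), "list")     # FALSE
identical(class(plain), "list")  # TRUE

# Hypothetical helper combining both checks
is_plain_list <- function(x) is.list(x) && identical(class(x), "list")

is_plain_list(df)     # FALSE
is_plain_list(plain)  # TRUE
```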

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia





-- 
Hervé Pagès


Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org 
Phone: (206) 667-5791 
Fax: (206) 667-1319 





Re: [Rd] as.list method for by Objects

2018-01-30 Thread Gabriel Becker
Dario,

What version of R are you using? In my mildly old 3.4.0 installation and in
the version of R-devel I have lying around (also mildly old...)  I don't see
the behavior I think you are describing

> b = by(1:2, 1:2, identity)

> class(as.list(b))

[1] "list"

> sessionInfo()

R Under development (unstable) (2017-12-19 r73926)

Platform: x86_64-apple-darwin15.6.0 (64-bit)

Running under: OS X El Capitan 10.11.6


Matrix products: default

BLAS:
/Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRblas.dylib

LAPACK:
/Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib


locale:

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8


attached base packages:

[1] stats graphics  grDevices utils datasets  methods   base


loaded via a namespace (and not attached):

[1] compiler_3.5.0

>


As for by not having a class definition, no S3 class has an explicit
definition, so this is somewhat par for the course here...

did I misunderstand something?


~G

On Tue, Jan 30, 2018 at 2:24 PM, Hervé Pagès  wrote:

> I agree that it makes sense to expect as.list() to perform
> a "strict coercion" i.e. to return an object of class "list",
> *even* on a list derivative. That's what as( , "list") does
> by default:
>
>   # on a data.frame object
>   as(data.frame(), "list")  # object of class "list"
> # (but strangely it drops the names)
>
>   # on a by object
>   x <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
>   as(x, "list")  # object of class "list"
>
> More generally speaking as() is expected to perform "strict
> coercion" by default, unless called with 'strict=FALSE'.
>
> That's also what as.list() does on a data.frame:
>
>   as.list(data.frame())  # object of class "list"
>
> FWIW as.numeric() also performs "strict coercion" on an integer
> vector:
>
>   as.numeric(1:3)  # object of class "numeric"
>
> So an as.list.env method that does the same as as(x, "list")
> would bring a small touch of consistency in an otherwise
> quite inconsistent world of coercion methods(*).
>
> H.
>
> (*) as(data.frame(), "list", strict=FALSE) doesn't do what you'd
> expect (just one of many examples)
>
>
> On 01/29/2018 05:00 PM, Dario Strbenac wrote:
>
>> Good day,
>>
>> I'd like to suggest the addition of an as.list method for a by object
>> that actually returns a list of class "list". This would make it safer to
>> do type-checking, because is.list also returns TRUE for a data.frame
>> variable and using class(result) == "list" is an alternative that only
>> returns TRUE for lists. It's also confusing initially that
>>
>> class(x)
>>>
>> [1] "by"
>>
>>> is.list(x)
>>>
>> [1] TRUE
>>
>> since there's no explicit class definition for "by" and no mention if it
>> has any superclasses.
>>
>> --
>> Dario Strbenac
>> University of Sydney
>> Camperdown NSW 2050
>> Australia
>>
>> __
>> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fredhutch.org
> Phone:  (206) 667-5791
> Fax:(206) 667-1319
>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 
Gabriel Becker, PhD
Scientist (Bioinformatics)
Genentech Research

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Best practices in developing package: From a single file

2018-01-30 Thread Suzen, Mehmet
Dear All,

Thank you for all the valuable input, and sorry for being off-topic for the
list. I will try R-pkg-devel for further related questions. I was
actually after "one-go" auto-documentation, in-line or out of comments,
from a single file/environment, in a similar spirit to
'package.skeleton' or an extension of it. My take-home message, or
summary, from all responses so far:

* Regarding documentation: Duncan Murdoch's wisdom, "...to get good
stuff in the help page, you need just as much work as in writing the
.Rd file directly...". So there is no silver bullet in terms of
auto-documentation, I gather, especially if one uses
more complex constructs, S4/S6 classes or Rcpp code behind. On the
other hand, roxygen2 appears to be the most comprehensive solution.

* A lightweight solution to try out before moving to RStudio fully: I
will give Dirk's 'pkgKitten' and the 'inlinedocs' Malcolm mentioned a try.

Interestingly, the responses have reminded me of Larry Wall's quote
(https://en.wikipedia.org/wiki/There%27s_more_than_one_way_to_do_it),
which I think really applies to R more than any language I have encountered
so far, from different class systems to different time-series
representations, so richly democratised.

Many regards,
Mehmet


On 30 January 2018 at 17:00, Suzen, Mehmet  wrote:
> Dear R developers,
>
> I am wondering what are the best practices for developing an R
> package. I am aware of Hadley Wickham's best practice
> documentation/book (http://r-pkgs.had.co.nz/).  I recall a couple of
> years ago there were some tools for generating a package out of a
> single file, such as using package.skeleton, but no auto-generated
> documentation. Do you know a way to generate documentation and a
> package out of single R source file, or from an environment?
>
> Many thanks,
> Mehmet

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] as.list method for by Objects

2018-01-30 Thread Hervé Pagès

On 01/30/2018 02:24 PM, Hervé Pagès wrote:

I agree that it makes sense to expect as.list() to perform
a "strict coercion" i.e. to return an object of class "list",
*even* on a list derivative. That's what as( , "list") does
by default:

   # on a data.frame object
   as(data.frame(), "list")  # object of class "list"
     # (but strangely it drops the names)

   # on a by object
   x <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
   as(x, "list")  # object of class "list"

More generally speaking as() is expected to perform "strict
coercion" by default, unless called with 'strict=FALSE'.

That's also what as.list() does on a data.frame:

   as.list(data.frame())  # object of class "list"

FWIW as.numeric() also performs "strict coercion" on an integer
vector:

   as.numeric(1:3)  # object of class "numeric"

So an as.list.env method that does the same as as(x, "list")

^^^
oops, I meant as.list.by, sorry...

H.


would bring a small touch of consistency in an otherwise
quite inconsistent world of coercion methods(*).

H.

(*) as(data.frame(), "list", strict=FALSE) doesn't do what you'd
     expect (just one of many examples)


On 01/29/2018 05:00 PM, Dario Strbenac wrote:

Good day,

I'd like to suggest the addition of an as.list method for a by object 
that actually returns a list of class "list". This would make it safer 
to do type-checking, because is.list also returns TRUE for a 
data.frame variable and using class(result) == "list" is an 
alternative that only returns TRUE for lists. It's also confusing 
initially that



class(x)

[1] "by"

is.list(x)

[1] TRUE

since there's no explicit class definition for "by" and no mention if 
it has any superclasses.


--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel







--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] as.list method for by Objects

2018-01-30 Thread Hervé Pagès

I agree that it makes sense to expect as.list() to perform
a "strict coercion" i.e. to return an object of class "list",
*even* on a list derivative. That's what as( , "list") does
by default:

  # on a data.frame object
  as(data.frame(), "list")  # object of class "list"
# (but strangely it drops the names)

  # on a by object
  x <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
  as(x, "list")  # object of class "list"

More generally speaking as() is expected to perform "strict
coercion" by default, unless called with 'strict=FALSE'.

That's also what as.list() does on a data.frame:

  as.list(data.frame())  # object of class "list"

FWIW as.numeric() also performs "strict coercion" on an integer
vector:

  as.numeric(1:3)  # object of class "numeric"

So an as.list.env method that does the same as as(x, "list")
would bring a small touch of consistency in an otherwise
quite inconsistent world of coercion methods(*).

H.

(*) as(data.frame(), "list", strict=FALSE) doesn't do what you'd
expect (just one of many examples)
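
For illustration only, a strict coercion of this kind could be sketched as
below. The helper name strict_as_list() is hypothetical (it is not part of
base R), and the sketch assumes that "strict" means dropping every attribute
except the element names:

```r
# Hypothetical helper (not in base R): return a plain list of class "list",
# keeping only the element names.
strict_as_list <- function(x) {
  nms <- names(x)          # for a 1-d array this reads dimnames(x)[[1]]
  attributes(x) <- NULL    # drops class, dim, dimnames, call, ...
  if (!is.list(x)) x <- as.list(x)
  names(x) <- nms
  x
}

x <- by(warpbreaks[, 1:2], warpbreaks[, "tension"], summary)
res <- strict_as_list(x)
class(res)   # "list"
names(res)   # one entry per tension level
```

With such a helper, a check like class(res) == "list" holds regardless of
which list derivative was passed in.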


On 01/29/2018 05:00 PM, Dario Strbenac wrote:

Good day,

I'd like to suggest the addition of an as.list method for a by object that actually returns a list 
of class "list". This would make it safer to do type-checking, because is.list also 
returns TRUE for a data.frame variable and using class(result) == "list" is an 
alternative that only returns TRUE for lists. It's also confusing initially that


class(x)

[1] "by"

is.list(x)

[1] TRUE

since there's no explicit class definition for "by" and no mention if it has 
any superclasses.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Best practices in developing package: From a single file

2018-01-30 Thread Kenny Bell
In response to Duncan regarding the use of roxygen2 from the point of view
of a current user, I believe the issue he brings up is one of correlation
rather than causation.

Writing my first piece of R documentation was made much easier by using
roxygen2, and it shallowed the learning curve substantially.

What Duncan may be observing is a general tendency of roxygen2 users to
write overly concise documentation. However, I believe this to be caused by
an omitted variable - likely the tendency of roxygen2 users to want outputs
quickly. I can't see anything in roxygen2 that might suggest a causal link
but I'd be interested in hearing specific examples.

FWIW, I've also heard the same documentation criticism leveled at R in
general (mostly from Stata and MATLAB users).

Kenny

On Wed, Jan 31, 2018, 10:12 AM Dirk Eddelbuettel  wrote:

>
> Mehmet,
>
> That is a loaded topic, not unlike other topics preoccupying us these
> days.
>
> There is package.skeleton() in base R as you already mentioned. It drove me
> bonkers that it creates packages which then fail R CMD check, so I wrote a
> wrapper package (pkgKitten) with another helper function (kitten()) which
> calls the base R helper and then cleans it up---but otherwise remains
> faithful to it.
>
> These days pkgKitten defaults to creating a per-package top-level help page
> that just references content from DESCRIPTION via a set of newer Rd macros,
> as I find that helps keep them aligned. The most recent example of mine is
>   https://github.com/eddelbuettel/prrd/blob/master/man/prrd-package.Rd
> I use either this function or the RStudio template helper all the time.
>
> And similarly, other people have written similar helpers. You may get other
> pointers.
>
> And every couple of months someone writes a new tutorial about how to
> write a first package. Then social media goes gaga and we get half a dozen
> blog posts where someone celebrates finding said tutorial, reading it and
> following through with a new package.
>
> And many of us have taught workshops on creating packages. There is a lot of
> material out there, though lots of this material seems to be entirely
> ignorant of what came before it.
>
> And there has been lots, including Fritz's tutorial from a decade ago:
>
> https://epub.ub.uni-muenchen.de/6175/  as well as on CRAN as
> https://cran.r-project.org/doc/contrib/Leisch-CreatingPackages.pdf
>
> So I'd recommend you just experiment and set up your own helpers. After all
> the rule still holds: Anything you do more than three times should be a
> function, and every function should be in a package. So customize _your_
> function to create your package.
>
> Dirk
>
> --
> http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] sum() returns NA on a long *logical* vector when nb of TRUE values exceeds 2^31

2018-01-30 Thread Hervé Pagès

Hi Martin, Henrik,

Thanks for the follow up.

@Martin: I vote for 2) without *any* hesitation :-)

(and uniformity could be restored at some point in the
future by having prod(), rowSums(), colSums(), and others
align with the behavior of length() and sum())

Cheers,
H.


On 01/27/2018 03:06 AM, Martin Maechler wrote:

Henrik Bengtsson 
 on Thu, 25 Jan 2018 09:30:42 -0800 writes:


 > Just following up on this old thread since matrixStats 0.53.0 is now
 > out, which supports this use case:

 >> x <- rep(TRUE, times = 2^31)

 >> y <- sum(x)
 >> y
 > [1] NA
 > Warning message:
 > In sum(x) : integer overflow - use sum(as.numeric(.))

 >> y <- matrixStats::sum2(x, mode = "double")
 >> y
 > [1] 2147483648
 >> str(y)
 > num 2.15e+09

 > No coercion is taking place, so the memory overhead is zero:

 >> profmem::profmem(y <- matrixStats::sum2(x, mode = "double"))
 > Rprofmem memory profiling of:
 > y <- matrixStats::sum2(x, mode = "double")

 > Memory allocations:
 > bytes calls
 > total 0

 > /Henrik

Thank you, Henrik, for the reminder.

Back in June, I had mentioned to Hervé and R-devel that
'logical' should remain to be treated as 'integer' as in all
arithmetic in (S and) R. Hervé did mention the isum()
function in the C code which is relevant here .. which does have
a LONG INT counter already -- *but* if we consider that sum()
has '...' i.e. a conceptually arbitrary number of long vector
integer arguments that counter won't suffice even there.

Before talking about implementation / patch, I think we should
consider 2 possible goals of a change --- I agree the status quo
is not a real option

1) sum(x) for logical and integer x  would return a double
   in any case and overflow should not happen (unless for
   the case where the result would be larger than
   .Machine$double.xmax, which I think will not be possible
   even with "arbitrary" nargs() of sum).

2) sum(x) for logical and integer x  should return an integer in
all cases there is no overflow, including returning
NA_integer_ in case of NAs.
If there would be an overflow it must be detected "in time"
and the result should be double.

The big advantage of 2) is that it is back compatible in 99.x %
of use cases, and another advantage that it may be a very small
bit more efficient.  Also, in the case of "counting" (logical),
it is nice to get an integer instead of double when we can --
entirely analogously to the behavior of length() which returns
integer whenever possible.

The advantage of 1) is uniformity.
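
As a rough R-level illustration of option 2) (not the actual proposal, which
would live in the C code of sum()), the behavior could be sketched as below.
The function name sum_int_or_double() is made up for this sketch, and unlike a
C implementation it copies x via as.numeric():

```r
# Hypothetical sketch of option 2): integer result when it fits, double otherwise.
sum_int_or_double <- function(x, na.rm = FALSE) {
  stopifnot(is.logical(x) || is.integer(x))
  total <- sum(as.numeric(x), na.rm = na.rm)  # double accumulator (copies x)
  if (is.na(total)) return(NA_integer_)       # NAs still give NA_integer_
  if (abs(total) <= .Machine$integer.max) as.integer(total) else total
}

small <- sum_int_or_double(c(TRUE, FALSE, TRUE))         # stays integer
big   <- sum_int_or_double(c(.Machine$integer.max, 1L))  # switches to double
with_na <- sum_int_or_double(c(1L, NA))                  # NA_integer_
```

The point of the design is that callers who never overflow keep getting
integers, while the overflow case yields a usable double instead of NA.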

We should (at least provisionally) decide between 1) and 2) and then go for that.
It could be that going for 1) may have bad
compatibility-consequences in package space, because indeed we
had documented sum() would be integer for logical and integer arguments.

I currently don't really have time to
{work on implementing + dealing with the consequences}
for either ..

Martin

 > On Fri, Jun 2, 2017 at 1:58 PM, Henrik Bengtsson
 >  wrote:
 >> I second this feature request (it's understandable that this and
 >> possibly other parts of the code was left behind / forgotten after the
 >> introduction of long vector).
 >>
 >> I think mean() avoids full copies, so in the meanwhile, you can work
 >> around this limitation using:
 >>
 >> countTRUE <- function(x, na.rm = FALSE) {
 >> nx <- length(x)
 >> if (nx < .Machine$integer.max) return(sum(x, na.rm = na.rm))
 >> nx * mean(x, na.rm = na.rm)
 >> }
 >>
 >> (not sure if one needs to worry about rounding errors, i.e. where n %% 0 != 0)
 >>
 >> x <- rep(TRUE, times = .Machine$integer.max+1)
 >> object.size(x)
 >> ## 8589934632 bytes
 >>
 >> p <- profmem::profmem( n <- countTRUE(x) )
 >> str(n)
 >> ## num 2.15e+09
 >> print(n == .Machine$integer.max + 1)
 >> ## [1] TRUE
 >>
 >> print(p)
 >> ## Rprofmem memory profiling of:
 >> ## n <- countTRUE(x)
 >> ##
 >> ## Memory allocations:
 >> ##  bytes calls
 >> ## total 0
 >>
 >>
 >> FYI / related: I've just updated matrixStats::sum2() to support
 >> logicals (develop branch) and I'll also try to update
 >> matrixStats::count() to count beyond .Machine$integer.max.
 >>
 >> /Henrik
 >>
 >> On Fri, Jun 2, 2017 at 4:05 AM, Hervé Pagès wrote:
 >>> Hi,
 >>>
 >>> I have a long numeric vector 'xx' and I want to use sum() to count
 >>> the number of elements that satisfy some criteria like non-zero
 >>> values or values lower than a certain threshold etc...
 >>>
 >>> The problem is: sum() returns an NA (with a warning) if the count
 >>> is greater than 2^31. For example:
 >>>
 >>> > xx <- runif(3e9)
 >>> > sum(xx < 0.9)
 >>> [1] NA
 >>> Warning message:
 >>> In sum(xx < 0.9) : integer overflow - use sum(as.nu

[Rd] CRAN indices out of whack (for at least macOS)

2018-01-30 Thread Dirk Eddelbuettel

I have received three distinct (non-)bug reports where someone claimed a
recent package of mine was broken ... simply because the macOS binary was not
there.

Is there something wrong with the cronjob providing the indices? Why is it
pointing people to binaries that do not exist?

Concretely, file

  https://cloud.r-project.org/bin/macosx/el-capitan/contrib/3.4/PACKAGES

contains

  Package: digest
  Version: 0.6.15
  Title: Create Compact Hash Digests of R Objects
  Depends: R (>= 2.4.1)
  Suggests: knitr, rmarkdown
  Built: R 3.4.3; x86_64-apple-darwin15.6.0; 2018-01-29 05:21:06 UTC; unix
  Archs: digest.so.dSYM

yet the _same directory_ only has:

  digest_0.6.14.tgz 15-Jan-2018 21:36   157K
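
A mismatch like this can be checked mechanically. The sketch below is only an
illustration: it parses an abbreviated copy of the index entry above with
read.dcf() and compares the file name the index implies against a hard-coded
directory listing (both inputs are typed in by hand here, not fetched from
CRAN):

```r
# Abbreviated PACKAGES entry and directory listing, copied from the report.
index_text <- c(
  "Package: digest",
  "Version: 0.6.15",
  "Depends: R (>= 2.4.1)"
)
dir_listing <- c("digest_0.6.14.tgz")

# PACKAGES files are in Debian control format, which read.dcf() parses.
con <- textConnection(index_text)
idx <- read.dcf(con, fields = c("Package", "Version"))
close(con)

# File names the index promises, and those that are not actually present.
expected <- sprintf("%s_%s.tgz", idx[, "Package"], idx[, "Version"])
missing_binaries <- setdiff(expected, dir_listing)
missing_binaries
```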

I presume this is a temporary accident.

We are all spoiled by you all providing such a wonderfully robust and
well-oiled service---so again big THANKS for that--but today something is out
of order.

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Best practices in developing package: From a single file

2018-01-30 Thread Dirk Eddelbuettel

Mehmet,

That is a loaded topic, not unlike other topics preoccupying us these days.

There is package.skeleton() in base R as you already mentioned. It drove me
bonkers that it creates packages which then fail R CMD check, so I wrote a
wrapper package (pkgKitten) with another helper function (kitten()) which
calls the base R helper and then cleans it up---but otherwise remains
faithful to it.

These days pkgKitten defaults to creating a per-package top-level help page
that just references content from DESCRIPTION via a set of newer Rd macros, as
I find that helps keep them aligned. The most recent example of mine is
  https://github.com/eddelbuettel/prrd/blob/master/man/prrd-package.Rd 
I use either this function or the RStudio template helper all the time.

And similarly, other people have written similar helpers. You may get other pointers.

And every couple of months someone writes a new tutorial about how to write a
first package. Then social media goes gaga and we get half a dozen blog posts
where someone celebrates finding said tutorial, reading it and following
through with a new package. 

And many of us have taught workshops on creating packages. There is a lot of
material out there, though lots of this material seems to be entirely ignorant
of what came before it.

And there has been lots, including Fritz's tutorial from a decade ago:

https://epub.ub.uni-muenchen.de/6175/  as well as on CRAN as
https://cran.r-project.org/doc/contrib/Leisch-CreatingPackages.pdf

So I'd recommend you just experiment and set up your own helpers. After all
the rule still holds: Anything you do more than three times should be a
function, and every function should be in a package. So customize _your_
function to create your package.

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Best practices in developing package: From a single file

2018-01-30 Thread Cook, Malcolm
> >> I am wondering what are the best practices for developing an R
 > >> package. I am aware of Hadley Wickham's best practice
 > >> documentation/book (http://r-pkgs.had.co.nz/).  I recall a couple of
 > >> years ago there were some tools for generating a package out of a
 > >> single file, such as using package.skeleton, but no auto-generated
 > >> documentation. Do you know a way to generate documentation and a
 > >> package out of single R source file, or from an environment?

I think you want to see the approach to generating a skeleton from a single .R 
file presented in:

Simple and sustainable R packaging using inlinedocs 
http://inlinedocs.r-forge.r-project.org/

I have not used it in some time but found it invaluable when I did.

I would be VERY INTERESTED to hear how others feel it has held up.

Joining conversation late,

~malcolm_c...@stowers.org

 > >
 > > Mehmet,
 > >
 > > This list is for development of the R language itself and closely
 > > related tools.  There is a separate list, R-pkg-devel, for development
 > > of packages.
 > >
 > > Since you're here, I'll try to answer your question.
 > >
 > > package.skeleton can create a package from all the R functions in a
 > > specified environment.  So if you load all the functions that you want
 > > in your new package into your R environment, then call
 > > package.skeleton, you'll have a starting point.
 > >
 > > At that point, I would probably recommend moving to RStudio, and using
 > > RStudio to generate markdown comments for roxygen for all your newly
 > > created function files.  Then you could finish off the documentation by
 > > writing it in these roxygen skeletons or copying and pasting from
 > > comments in your original code files.
 > 
 > I'd agree about moving to RStudio, but I think Roxygen is the wrong
 > approach for documentation.  package.skeleton() will have done the
 > boring mechanical part of setting up your .Rd files; all you have to do
 > is edit some content into them.  (Use prompt() to add a new file if you
 > add a new function later, don't run package.skeleton() again.)
 > 
 > This isn't the fashionable point of view, but I think it is easier to
 > get good documentation that way than using Roxygen.  (It's easier to get
 > bad documentation using Roxygen, but who wants that?)
 > 
 > The reason I think this is that good documentation requires work and
 > thought.  You need to think about the markup that will get your point
 > across, you need to think about putting together good examples, etc.
 > This is *harder* in Roxygen than if you are writing Rd files, because
 > Roxygen is a thin front end to produce Rd files from comments in your .R
 > files.  To get good stuff in the help page, you need just as much work
 > as in writing the .Rd file directly, but then you need to add another
 > layer on top to put it in a comment.  Most people don't bother.
 > 
 > I don't know any packages with what I'd consider to be good
 > documentation that use Roxygen.  It's just too easy to write minimal
 > documentation that passes checks, so Roxygen users don't keep refining it.
 > 
 > (There are plenty of examples of packages that write bad documentation
 > directly to .Rd as well.  I just don't know of examples of packages with
 > good documentation that use Roxygen.)
 > 
 > Based on my criticism last week of git and Github, I expect to be called
 > a grumpy old man for holding this point of view.  I'd actually like to
 > be proven wrong.  So to anyone who disagrees with me:  rather than just
 > calling me names, how about some examples of Roxygen-using packages that
 > have good help pages with good explanations, and good examples in them?
 > 
 > Back to Mehmet's question:  I think Hadley's book is pretty good, and
 > I'd recommend most of it, just not the Roxygen part.
 > 
 > Duncan Murdoch
 > 
 > __
 > R-devel@r-project.org mailing list
 > https://stat.ethz.ch/mailman/listinfo/r-devel
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Best practices in developing package: From a single file

2018-01-30 Thread Duncan Murdoch

On 30/01/2018 11:29 AM, Brian G. Peterson wrote:

On Tue, 2018-01-30 at 17:00 +0100, Suzen, Mehmet wrote:

Dear R developers,

I am wondering what are the best practices for developing an R
package. I am aware of Hadley Wickham's best practice
documentation/book (http://r-pkgs.had.co.nz/).  I recall a couple of
years ago there were some tools for generating a package out of a
single file, such as using package.skeleton, but no auto-generated
documentation. Do you know a way to generate documentation and a
package out of single R source file, or from an environment?


Mehmet,

This list is for development of the R language itself and closely
related tools.  There is a separate list, R-pkg-devel, for development
of packages.

Since you're here, I'll try to answer your question.

package.skeleton can create a package from all the R functions in a
specified environment.  So if you load all the functions that you want
in your new package into your R environment, then call
package.skeleton, you'll have a starting point.

At that point, I would probably recommend moving to RStudio, and using
RStudio to generate markdown comments for roxygen for all your newly
created function files.  Then you could finish off the documentation by
writing it in these roxygen skeletons or copying and pasting from
comments in your original code files.


I'd agree about moving to RStudio, but I think Roxygen is the wrong 
approach for documentation.  package.skeleton() will have done the 
boring mechanical part of setting up your .Rd files; all you have to do 
is edit some content into them.  (Use prompt() to add a new file if you 
add a new function later, don't run package.skeleton() again.)
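
As a small sketch of the prompt() step, the following writes an Rd skeleton
for a newly added function. The function clamp() is made up for the example,
and the file goes to a temporary directory here; in a real package it would go
into the man/ directory:

```r
# A new function added after the package skeleton was created (hypothetical).
clamp <- function(x, lower = 0, upper = 1) pmin(pmax(x, lower), upper)

# Write an Rd skeleton for it; the content still has to be edited by hand.
rd_file <- file.path(tempdir(), "clamp.Rd")
utils::prompt(clamp, filename = rd_file, name = "clamp")

rd_lines <- readLines(rd_file)
head(rd_lines)
```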


This isn't the fashionable point of view, but I think it is easier to 
get good documentation that way than using Roxygen.  (It's easier to get 
bad documentation using Roxygen, but who wants that?)


The reason I think this is that good documentation requires work and 
thought.  You need to think about the markup that will get your point 
across, you need to think about putting together good examples, etc.
This is *harder* in Roxygen than if you are writing Rd files, because 
Roxygen is a thin front end to produce Rd files from comments in your .R 
files.  To get good stuff in the help page, you need just as much work 
as in writing the .Rd file directly, but then you need to add another 
layer on top to put it in a comment.  Most people don't bother.
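
To make the "front end" point concrete, here is a minimal sketch of what the
comment layer looks like. The function scale01() is invented for the example;
@param, @return, @examples and @export are standard roxygen2 tags, and
roxygen2 turns such a block into the corresponding .Rd file:

```r
#' Rescale a numeric vector to the unit interval
#'
#' @param x A numeric vector with at least two distinct finite values.
#' @return A numeric vector of the same length as `x`, with minimum 0
#'   and maximum 1.
#' @examples
#' scale01(c(2, 4, 6))
#' @export
scale01 <- function(x) {
  rng <- range(x)
  (x - rng[1]) / (rng[2] - rng[1])
}

scaled <- scale01(c(2, 4, 6))
```

The same thinking about wording, markup and examples is needed either way;
only the place where the text lives differs.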


I don't know any packages with what I'd consider to be good 
documentation that use Roxygen.  It's just too easy to write minimal 
documentation that passes checks, so Roxygen users don't keep refining it.


(There are plenty of examples of packages that write bad documentation 
directly to .Rd as well.  I just don't know of examples of packages with 
good documentation that use Roxygen.)


Based on my criticism last week of git and Github, I expect to be called 
a grumpy old man for holding this point of view.  I'd actually like to 
be proven wrong.  So to anyone who disagrees with me:  rather than just 
calling me names, how about some examples of Roxygen-using packages that 
have good help pages with good explanations, and good examples in them?


Back to Mehmet's question:  I think Hadley's book is pretty good, and 
I'd recommend most of it, just not the Roxygen part.


Duncan Murdoch

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] as.list method for by Objects

2018-01-30 Thread Michael Lawrence
I agree that it would make sense for the object to have c("by", "list") as
its class attribute, since the object is known to behave as a list.
However, it may be too disruptive to make this change at this point.
Hard to predict.

Michael

On Mon, Jan 29, 2018 at 5:00 PM, Dario Strbenac 
wrote:

> Good day,
>
> I'd like to suggest the addition of an as.list method for a by object that
> actually returns a list of class "list". This would make it safer to do
> type-checking, because is.list also returns TRUE for a data.frame variable
> and using class(result) == "list" is an alternative that only returns TRUE
> for lists. It's also confusing initially that
>
> > class(x)
> [1] "by"
> > is.list(x)
> [1] TRUE
>
> since there's no explicit class definition for "by" and no mention if it
> has any superclasses.
>
> --
> Dario Strbenac
> University of Sydney
> Camperdown NSW 2050
> Australia
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Best practices in developing package: From a single file

2018-01-30 Thread Brian G. Peterson
On Tue, 2018-01-30 at 17:00 +0100, Suzen, Mehmet wrote:
> Dear R developers,
> 
> I am wondering what are the best practices for developing an R
> package. I am aware of Hadley Wickham's best practice
> documentation/book (http://r-pkgs.had.co.nz/).  I recall a couple of
> years ago there were some tools for generating a package out of a
> single file, such as using package.skeleton, but no auto-generated
> documentation. Do you know a way to generate documentation and a
> package out of single R source file, or from an environment?

Mehmet,

This list is for development of the R language itself and closely
related tools.  There is a separate list, R-pkg-devel, for development
of packages.  

Since you're here, I'll try to answer your question.

package.skeleton can create a package from all the R functions in a
specified environment.  So if you load all the functions that you want
in your new package into your R environment, then call
package.skeleton, you'll have a starting point.
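
A minimal sketch of that starting point, assuming two toy functions and a
temporary output directory (the names pkg_env, add_one, times_two and demoPkg
are placeholders for this example):

```r
# Collect the functions for the new package in their own environment.
pkg_env <- new.env()
pkg_env$add_one <- function(x) x + 1
pkg_env$times_two <- function(x) x * 2

# Create the skeleton in a throwaway directory.
out_dir <- tempfile("skeleton-")
dir.create(out_dir)
utils::package.skeleton(
  name = "demoPkg",
  list = c("add_one", "times_two"),
  environment = pkg_env,
  path = out_dir
)

pkg_dir <- file.path(out_dir, "demoPkg")
list.files(pkg_dir, recursive = TRUE)
```

The generated help files are only stubs and still need to be filled in.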

At that point, I would probably recommend moving to RStudio, and using
RStudio to generate markdown comments for roxygen for all your newly
created function files.  Then you could finish off the documentation by
writing it in these roxygen skeletons or copying and pasting from
comments in your original code files.

Please address further discussion to the R-pkg-devel list.

Regards,

Brian

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Best practices in developing package: From a single file

2018-01-30 Thread Suzen, Mehmet
Dear R developers,

I am wondering what are the best practices for developing an R
package. I am aware of Hadley Wickham's best practice
documentation/book (http://r-pkgs.had.co.nz/).  I recall a couple of
years ago there were some tools for generating a package out of a
single file, such as using package.skeleton, but no auto-generated
documentation. Do you know a way to generate documentation and a
package out of single R source file, or from an environment?

Many thanks,
Mehmet

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] withTimeout bug, it does not work properly with nlme anymore

2018-01-30 Thread Martin Maechler
> Ramiro Barrantes 
> on Mon, 27 Nov 2017 21:02:52 + writes:

> Hello, I was relying on withTimeout (from R.utils) to help
> me stop nlme when it "hangs".  However, recently this
> stopped working.  I am pasting a reproducible example
> below: withTimeout should stop nlme after 10 seconds but
> the code will generate data for which nlme does not
> converge (or takes too long) and withTimeout does not stop
> it.  I tried this both on a linux (64 bit, CentOS 7, R
> 3.4.1, nlme 3.1-131 R.util 2.6, and also with R 3.2.5) and
> mac (Sierra 10.13.1, R 3.4.2, same versions or nlme and
> R.utils).  It takes over R and I need to use brute-force
> to stop it.  As mentioned, this used to work and it is
> very helpful for the purposes of having a loop where nlme
> goes through many models.

> Thank you in advance for any help, Ramiro

Dear Ramiro,

As I thought you were reporting a bug in R.utils' withTimeout(),
I (and maybe others) did not react.

You've raised this again in a non-public e-mail,
and indeed the underlying bug is really in nlme, which you do
mention implicitly.

I'm appending a version of your example that does not use R.utils
at all and reproducibly hangs for me with R 3.4.3, R 3.4.3
patched and R-devel (and almost surely with earlier versions of R,
which I did not check).

The call to nlme() indeed "stalls" / "hangs" / "freezes" R
and cannot be terminated in a regular way. Like
you, I need "brute force" to stop it, killing the R process
too.

As the maintainer of 'nlme'  *is* R-core,
we are asked to fix this, at least by making it interruptible.

Still, I cannot take time for that during the next couple of
weeks, as I have several other day-job duties to fulfill
first, and so will not promise anything here.

Tested (minimal) patches are welcome!

Here's a slightly simplified version of your script which
exhibits the problem and shows that it does not
happen in nlminb() -- which I wrongly assumed for a while --
but in nlme's call to its own .C() code.

I am looking into fixing this (making it interruptible // detecting
the infinite loop).
My guess is that it only happens in degenerate cases like this one.
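
For contrast, here is a rough sketch of a time limit that does work on
interpreted R code. It assumes, as I believe is the case, that
withTimeout() is built on base R's setTimeLimit(); with_limit() is a
hypothetical helper of mine, not part of R.utils or nlme. The limit is
only checked where R checks for interrupts, which is why compiled code
that never does so cannot be stopped this way.

```r
## Sketch only: with_limit() is a hypothetical helper, not an R.utils function.
with_limit <- function(expr, seconds) {
  ## Limit the elapsed time of the current top-level computation ...
  setTimeLimit(elapsed = seconds, transient = TRUE)
  ## ... and always lift the limit again when we leave.
  on.exit(setTimeLimit(elapsed = Inf, transient = FALSE))
  ## 'expr' is evaluated lazily here, i.e. under the limit.
  tryCatch(expr, error = function(e) paste("stopped:", conditionMessage(e)))
}

## An endless loop in interpreted R code: R checks the limit between
## iterations, so the loop is stopped after roughly half a second.
res <- with_limit(repeat Sys.sleep(0.01), seconds = 0.5)
print(res)
```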

Martin Maechler
ETH Zurich


## From: Ramiro Barrantes 
## To: "r-devel@r-project.org" 
## Subject: [Rd] withTimeout bug, it does not work properly with nlme anymore
## Date: Mon, 27 Nov 2017 21:02:52 +

## Hello,

## I was relying on withTimeout (from R.utils) to help me stop nlme when it
## "hangs".  However, recently this stopped working.  I am pasting a
## reproducible example below: withTimeout should stop nlme after 10 seconds
## but the code will generate data for which nlme does not converge (or takes
## too long) and withTimeout does not stop it.  I tried this both on a linux
## (64 bit, CentOS 7, R 3.4.1, nlme 3.1-131 R.utils 2.6, and also with R
## 3.2.5) and mac (Sierra 10.13.1, R 3.4.2, same versions of nlme and
## R.utils).  It takes over R and I need to use brute-force to stop it.  As
## mentioned, this used to work and it is very helpful for the purposes of
## having a loop where nlme goes through many models.

## Thank you in advance for any help,
## Ramiro

## (Modifications by Martin Maechler)
dat <- data.frame(

x=c(3.69,3.69,3.69,3.69,3.69,3.69,3.69,3.69,3.69,3.69,3.69,3.69,3,3,3,3,3,3,3,3,3,3,3,3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,1.61,1.61,1.61,1.61,1.61,1.61,1.61,1.61,1.61,1.61,1.61,1.61,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86),

y=c(0.35,0.69,0.57,1.48,6.08,-0.34,0.53,1.66,0.02,4.4,8.42,3.3,2.32,-2.3,7.52,-2.12,3.41,-4.76,7.9,5.04,10.26,-1.42,7.85,-1.88,3.81,-2.59,4.32,5.7,1.18,
 
-1.74,1.81,6.16,4.2,-0.39,1.55,-1.4,1.76,-4.14,-2.36,-0.24,4.8,-7.07,1.34,1.98,0.86,-3.96,-0.61,2.68,-1.65,-2.06,3.67,-0.19,2.33,3.78,2.16,0.35,
 
-5.6,1.32,2.99,4.21,-0.9,4.32,-4.01,2.03,0.9,-0.74,-5.78,5.76,0.52,1.37,-0.9,-4.06,-0.49,-2.39,-2.67,-0.71,-0.4,2.55,0.97,1.96,8.13,-5.93,4.01,0.79,
 -5.61,0.29,4.92,-2.89,-3.24,-3.06,-0.23,0.71,0.75,4.6,1.35, -3.35),
f.block = rep(1:4, 24),
id= paste0("a", rep(c(2,1,3),each=4)))
str(dat)
## 'data.frame': 96 obs. of  4 variables:
##  $ x      : num  3.69 3.69 3.69 3.69 3.69 3.69 3.69 3.69 3.69 3.69 ...
##  $ y      : num  0.35 0.69 0.57 1.48 6.08 -0.34 0.53 1.66 0.02 4.4 ...
##  $ f.block: num  1 2 3 4 1 2 3 4 1 2 ...
##  $ id     : Factor w/ 3 levels "a1","a2","a3": 2 2 2 2 1 1 1 1 3 3 ...

table(dat$id) # 32 x 3 -- indeed the 2 factors are perfectly balanced:
xtabs(~id + f.block, data=dat)

## This is the version to directly trigger the bug
dd <- dat
set.seed(33)
dd$y <- dat$y + rnorm(nrow(dat), mean = 0, sd = 0.1)

library(nlme, lib = .Library) # <- get R's version not a newer one
cat(

[Rd] LCONS undefined for R_NO_REMAP

2018-01-30 Thread Jeroen Ooms
Rinternals.h has:

#define CONS(a, b) cons((a), (b))
#define LCONS(a, b) lcons((a), (b))

However these are undefined when we compile with -DR_NO_REMAP. Maybe
it's safer to define these using Rf_cons() and Rf_lcons() instead?
