Re: [Rd] CRAN policies
One of the things I have noticed with the R 2.15.0 RC and --as-cran is that I have to bump the version number of the working copy of my packages immediately after putting a version on CRAN, or I get a message about version suitability. This is probably a good thing for packages that I have changed, compared with my old habit of bumping the version number at arbitrary times, although the mechanics are a nuisance because I do not actually want to commit to the next version number at that point.

For packages that I have not changed it is a bit worse, because I have to change the version number even though I have not yet made any changes to the package. This will mean, for example, that on R-forge it will look like there is a slightly newer version, even though there is not really.

I am curious how other developers approach this. Is it better to not specify --as-cran most of the time? My feeling is that it is better to specify it all of the time so that I catch errors sooner rather than later, but maybe there is a better solution?

Paul

On 12-03-27 07:52 AM, Prof Brian Ripley wrote:
> CRAN has for some time had a policies page at
> http://cran.r-project.org/web/packages/policies.html
> and we would like to draw this to the attention of package maintainers. In particular, please
>
> - always send a submission email to c...@r-project.org with the package name and version on the subject line. Emails sent to individual members of the team will result in delays at best.
>
> - run R CMD check --as-cran on the tarball before you submit it. Do this with the latest version of R possible: definitely R 2.14.2, preferably R 2.15.0 RC or a recent R-devel. (Later versions of R are able to give better diagnostics, e.g. for compiled code and especially on Windows. They may also have extra checks for recently uncovered problems.)
>
> Also, please note that CRAN has a very heavy workload (186 packages were published last week) and to remain viable needs package maintainers to make its life as easy as possible.
>
> Kurt Hornik
> Uwe Ligges
> Brian Ripley

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] CRAN policies
On 27.03.2012 16:17, Paul Gilbert wrote:
> One of the things I have noticed with the R 2.15.0 RC and --as-cran is that I have to bump the version number of the working copy of my packages immediately after putting a version on CRAN, or I get a message about version suitability.
>
> I am curious how other developers approach this. Is it better to not specify --as-cran most of the time? My feeling is that it is better to specify it all of the time so that I catch errors sooner rather than later, but maybe there is a better solution?

--as-cran is modelled rather closely after the CRAN incoming checks. CRAN checks if a new version has a new version number. Of course, you can ignore its result if you do not want to submit. The idea of using --as-cran is to apply it before you actually submit. Some parts require network connection etc.

Uwe
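The "new version has a new version number" rule can be illustrated outside R. The following is only a rough sketch in Python, assuming the usual R package version format (two or more integers separated by "." or "-"). The function names `parse_version` and `is_newer` are made up for this example, and this is not the code the CRAN incoming check runs.

```python
import re

def parse_version(version):
    """Split an R-style package version such as '1.2-3' into a tuple of ints."""
    # Assumption: a valid version is at least two integers joined by '.' or '-'.
    if not re.fullmatch(r"\d+([.-]\d+)+", version):
        raise ValueError(f"not a valid package version: {version!r}")
    return tuple(int(part) for part in re.split(r"[.-]", version))

def is_newer(candidate, published):
    """Return True if `candidate` is strictly greater than `published`.

    Components are compared numerically from left to right, so '0.10'
    is newer than '0.9'. An unchanged version number is never newer.
    """
    return parse_version(candidate) > parse_version(published)

if __name__ == "__main__":
    print(is_newer("0.4-1", "0.4-0"))  # working copy bumped after release
    print(is_newer("0.4-0", "0.4-0"))  # working copy still at the CRAN version
```

In a private test script, a check like this could be used to decide whether the working copy is a submission candidate at all before running the full --as-cran check.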
Re: [Rd] CRAN policies
On Tue, Mar 27, 2012 at 7:52 AM, Prof Brian Ripley wrote:
> CRAN has for some time had a policies page at
> http://cran.r-project.org/web/packages/policies.html
> and we would like to draw this to the attention of package maintainers.

Regarding the part about "warnings or significant notes" in that page, it is impossible to know which notes are significant and which ones are not significant except by trial and error.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [Rd] CRAN policies
On 12-03-27 10:59 AM, Uwe Ligges wrote:
> --as-cran is modelled rather closely after the CRAN incoming checks. CRAN checks if a new version has a new version number. Of course, you can ignore its result if you do not want to submit. The idea of using --as-cran is to apply it before you actually submit. Some parts require network connection etc.
>
> Uwe

Yes but, for example, will R-forge run checks with --as-cran, and thus give warnings for any package unchanged from the one on CRAN, or run without --as-cran, and thus not give a true indication of whether the package is good to submit?

(No doubt R-forge will customise more, but I am trying to work out a strategy for my own automatic testing.)

Paul
Re: [Rd] CRAN policies
On 27.03.2012 17:22, Paul Gilbert wrote:
> Yes but, for example, will R-forge run checks with --as-cran, and thus give warnings for any package unchanged from the one on CRAN, or run without --as-cran, and thus not give a true indication of whether the package is good to submit?

This is a question for the R-forge maintainer. I would not expect it runs checks --as-cran, but I do not know.

Best,
Uwe

> (No doubt R-forge will customise more, but I am trying to work out a strategy for my own automatic testing.)
>
> Paul
Re: [Rd] CRAN policies
On 27.03.2012 17:09, Gabor Grothendieck wrote:
> Regarding the part about "warnings or significant notes" in that page, it is impossible to know which notes are significant and which ones are not significant except by trial and error.

Right, it needs human inspection to identify false positives. We believe most package maintainers are able to see whether they are hit by such a false positive.

Uwe Ligges
Re: [Rd] CRAN policies
On 27/03/2012 15:17, Paul Gilbert wrote:
> One of the things I have noticed with the R 2.15.0 RC and --as-cran is that I have to bump the version number of the working copy of my packages immediately after putting a version on CRAN, or I get a message about version suitability.
>
> I am curious how other developers approach this. Is it better to not specify --as-cran most of the time? My feeling is that it is better to specify it all of the time so that I catch errors sooner rather than later, but maybe there is a better solution?

Yes. It is only recommended for use just before submission. It is not used by the CRAN daily checks, for example.

All it does is set some environment variables that you can also set in ~/.R/check.Renviron, scripts ... and that is what the CRAN team do. We introduced --as-cran to make it easier to explain to submitters how to get the check results we reported [*]. As for what the set is, read 'R Internals' or the code (it will vary by R version).

Given that we get several submissions per week with the same version number or name as a package already on CRAN, we do need submitters to run the 'incoming' check before submission.

[*] Since answering several emails a day about why their results were different was taking up far too much time.

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
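Since --as-cran is described above as a set of environment variables that can also live in ~/.R/check.Renviron, a small helper can generate such a file. This is a hedged sketch in Python: `render_check_renviron` is a hypothetical helper, and `_R_CHECK_CRAN_INCOMING_` is, to the best of my knowledge, the variable documented in 'R Internals' for the 'incoming' check. The full set behind --as-cran varies by R version and is deliberately not reproduced here.

```python
def render_check_renviron(settings):
    """Render NAME=value lines suitable for a file such as ~/.R/check.Renviron.

    `settings` maps environment variable names to values. Booleans are
    written as TRUE/FALSE, which is how R check variables are usually set.
    """
    lines = []
    for name, value in sorted(settings.items()):
        if isinstance(value, bool):
            value = "TRUE" if value else "FALSE"
        lines.append(f"{name}={value}")
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    # Assumption: only the 'incoming' check is switched on here; consult
    # 'R Internals' for the other variables your R version understands.
    print(render_check_renviron({"_R_CHECK_CRAN_INCOMING_": True}), end="")
```

Keeping the settings in one place like this makes it easier to use one profile for routine automatic testing and a stricter one just before submission.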
Re: [Rd] CRAN policies
2012/3/27 Uwe Ligges:
> Right, it needs human inspection to identify false positives. We believe most package maintainers are able to see if he or she is hit by such a false positive.

The problem is that a note is generated and the note is correct. It is not a false positive. But that does not tell you whether it is "significant" or not. There is no way to know. One can either try to remove all notes (which may not be feasible) or just upload it and by trial and error find out if it is accepted or not.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [Rd] CRAN policies
Is there a distinction as to NOTE vs. WARNING that is documented? I've always assumed (wrongly?) that NOTEs weren't an issue with publishing on CRAN, but that they may change to WARNINGs at some point.

Is the process by which this happens documented somewhere?

Jeff

On 3/27/12 11:09 AM, "Gabor Grothendieck" wrote:
> The problem is that a note is generated and the note is correct. It is not a false positive. But that does not tell you whether it is "significant" or not. There is no way to know. One can either try to remove all notes (which may not be feasible) or just upload it and by trial and error find out if it is accepted or not.
Re: [Rd] CRAN policies
On 27.03.2012 19:10, Jeffrey Ryan wrote:
> Is there a distinction as to NOTE vs. WARNING that is documented? I've always assumed (wrongly?) that NOTEs weren't an issue with publishing on CRAN, but that they may change to WARNINGs at some point.

We won't kick packages off CRAN for Notes (but we will if Warnings are not fixed), but we may not accept new submissions with significant Notes.

Best,
Uwe Ligges

> Is the process by which this happens documented somewhere?
>
> Jeff
Re: [Rd] CRAN policies
Thanks Uwe for the clarification on what goes and what stays.

Still fuzzy on the notion of "significant" though. Do you have an example or two for the list?

Jeff

P.S. I meant to also thank all of the CRAN volunteers for the momentous efforts involved, and it is nice to see some explanation of how we can help, as well as a peek into what goes on 'behind the curtain' ;-)

On 3/27/12 1:19 PM, "Uwe Ligges" wrote:
> We won't kick packages off CRAN for Notes (but we will if Warnings are not fixed), but we may not accept new submissions with significant Notes.
>
> Best,
> Uwe Ligges
Re: [Rd] CRAN policies
2012/3/27 Uwe Ligges:
> We won't kick packages off CRAN for Notes (but we will if Warnings are not fixed), but we may not accept new submissions with significant Notes.

Yes, I understand that, but it does not really address the problem: one has no idea whether a Note is significant or not, so the only way to determine its significance is to submit your package and see if it is accepted or not.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [Rd] CRAN policies
An associated problem, for the wish list, is that it would be nice for package developers to have a way to automatically distinguish between NOTEs that can usually be ignored (e.g. a package suggests a package that is not available for cross reference checks; I have several cases where the suggested package depends on the package being built, so this NOTE occurs all the time), and NOTEs that are really pre-WARNINGs, so that one can flag these and spend time fixing them before they become a WARNING or ERROR. Perhaps two different kinds of notes?

(And, BTW, having been responsible for a certain amount of the

> [*] Since answering several emails a day about why their
> results were different was taking up far too much time.

I think --as-cran is great.)

Paul

On 12-03-27 02:19 PM, Uwe Ligges wrote:
> We won't kick packages off CRAN for Notes (but we will if Warnings are not fixed), but we may not accept new submissions with significant Notes.
>
> Best,
> Uwe Ligges
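Until something like two kinds of notes exists, a maintainer can approximate the distinction locally. The sketch below is one possible approach in Python, under stated assumptions: status lines in the check log are assumed to look like "* checking <what> ... <STATUS>", the sample text is illustrative rather than real output, and the `ignorable` list is the maintainer's own judgement. It says nothing about which Notes CRAN regards as significant.

```python
import re

# Assumed shape of a status line in the check log:
#   "* checking <what> ... <STATUS>"
STATUS_LINE = re.compile(
    r"^\* checking (?P<what>.+?) \.\.\. (?P<status>OK|NOTE|WARNING|ERROR)\s*$"
)

def triage(log_text, ignorable):
    """Sort non-OK check items into 'ignored' and 'review' lists.

    `ignorable` is a maintainer-kept list of substrings naming checks
    whose NOTEs have been judged harmless for this package. WARNINGs
    and ERRORs always go to 'review'.
    """
    result = {"ignored": [], "review": []}
    for line in log_text.splitlines():
        match = STATUS_LINE.match(line)
        if not match or match.group("status") == "OK":
            continue
        what, status = match.group("what"), match.group("status")
        harmless = status == "NOTE" and any(p in what for p in ignorable)
        result["ignored" if harmless else "review"].append((status, what))
    return result

if __name__ == "__main__":
    # Illustrative sample only, not output copied from a real check run.
    sample = (
        "* checking package dependencies ... NOTE\n"
        "* checking R code for possible problems ... NOTE\n"
        "* checking examples ... OK\n"
        "* checking Rd cross-references ... WARNING\n"
    )
    print(triage(sample, ignorable=["package dependencies"]))
```

The ignore list is per package and kept by hand, so a Note only moves out of the review pile after a human has looked at it once.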
Re: [Rd] CRAN policies
I have been wondering if it is possible to automate the checking process to reduce human effort, e.g. automatically check the packages submitted to FTP, and send the package maintainer an email in case of warnings or errors (otherwise just move it to CRAN); package maintainers can appeal for a manual check by CRAN maintainers in case of false positives.

As a package author, I really hate to bother CRAN maintainers each time I upload a new version and it passes R CMD check successfully, in which case I should have received an automatic email instead of Kurt's hand-written "thanks, on CRAN now". Frankly speaking, it makes me feel guilty sometimes to update my packages, thinking of the other 3700 packages on CRAN and how much time you CRAN maintainers are spending on checking the packages.

I do not know how many package authors actually read this mailing list, so these policies may not really reach some authors at all.

Regards,
Yihui
--
Yihui Xie
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA
Re: [Rd] CRAN policies
> I have been wondering if it is possible to automate the checking process to reduce human efforts, e.g. automatically check the packages submitted to FTP, and send the package maintainer an email in case of warnings or errors (otherwise just move it to CRAN); package maintainers can appeal for a manual check by CRAN maintainers in case of false positives.
I've started using win-builder before submitting to CRAN. This often picks up problems that I don't see locally.
Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/
Re: [Rd] CRAN policies
On Tue, Mar 27, 2012 at 6:52 AM, Prof Brian Ripley wrote:
> CRAN has for some time had a policies page at http://cran.r-project.org/web/packages/policies.html and we would like to draw this to the attention of package maintainers. In particular, please
Thanks for the pointer - I did not know that this page existed. In general, is there some easy way to track changes to this page and the R extension manual over time? It is difficult to keep track of the best practices. I'd also like to get clarification on "Packages should not write in the users' home filespace, nor anywhere else on the file system apart from the R session's temporary directory (or during installation in the location pointed to by TMPDIR: and such usage should be cleaned up)." - what is the recommended practice for packages to maintain state across instances? Operating systems have standards for where applications can store settings (e.g. as described in http://pypi.python.org/pypi/appdirs/1.2.0). Is it acceptable for packages to follow these conventions?
Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/
Re: [Rd] CRAN policies
Lots of very sensible policies here. I have one request as someone who has in several cases had to involve company lawyers over intellectual property issues with packages on CRAN -- the first bullet point on ownership of copyright and intellectual property rights could be strengthened further. To the existing text "The ownership of copyright and intellectual property rights of all components of the package must be clear and unambiguous (including from the authors specification in the DESCRIPTION file). Where code is copied (or derived) from the work of others (including from R itself), care must be taken that any copyright statements are preserved and authorship is not misrepresented. Trademarks must be respected." I would add a few additional points : 1. The text of the license itself should be included in the package in a LICENSE or COPYING file, as most of these licenses have things that need to be filled in with names and other data, and just referencing a license name in the DESCRIPTION file is not really a great way to deal with licensing metadata when used exclusively (it's a great complement to a full, filled-out license in the package itself). 2. Per file copyright comment headers can help immensely with ensuring compliance and the accidental incorporation of files under a different license. Comment header blocks with the author name and terms of distribution could be recommended for all source files. - Murray On Tue, Mar 27, 2012 at 4:52 AM, Prof Brian Ripley wrote: > CRAN has for some time had a policies page at > http://cran.r-project.org/web/packages/policies.html > and we would like to draw this to the attention of package maintainers. In > particular, please > > - always send a submission email to c...@r-project.org with the package > name and version on the subject line. Emails sent to individual members of > the team will result in delays at best. > > - run R CMD check --as-cran on the tarball before you submit it. 
Do > this with the latest version of R possible: definitely R 2.14.2, > preferably R 2.15.0 RC or a recent R-devel. (Later versions of R are > able to give better diagnostics, e.g. for compiled code and especially > on Windows. They may also have extra checks for recently uncovered > problems.) > > Also, please note that CRAN has a very heavy workload (186 packages were > published last week) and to remain viable needs package maintainers to make > its life as easy as possible. > > Kurt Hornik > Uwe Ligges > Brian Ripley > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] CRAN policies
> From: x...@yihui.name
> Date: Tue, 27 Mar 2012 16:40:04 -0500
> To: r-devel@r-project.org
> Subject: Re: [Rd] CRAN policies
>
> I have been wondering if it is possible to automate the checking process to reduce human efforts, e.g. automatically check the packages submitted to FTP, and send the package maintainer an email in case of warnings or errors (otherwise just move it to CRAN); package maintainers can appeal for a manual check by CRAN maintainers in case of false positives. As a package author, I really hate to bother CRAN maintainers each time I upload a new version and it passes R CMD check successfully, in which case I should have received an automatic email instead of Kurt's "hand-writing" "thanks, on CRAN now". Frankly speaking, it makes me feel guilty sometimes to update my packages, thinking of other 3700 packages on CRAN and how much time you CRAN maintainers are spending on checking the packages.
Indeed it is a good summary of how I have felt for so long, and in particular of my recent experience, which involved Kurt, Brian, and Uwe. I think win-builder certainly helps, but is it feasible to have a Linux counterpart "to have a final say"?
> I do not know how many package authors actually read this mailing list, so these policies may not really reach some authors at all.
Certainly more colleagues read the list than have been revealed by the postings.
Kind regards, Jing Hua
> Regards,
> Yihui
> --
> Yihui Xie
> Phone: 515-294-2465 Web: http://yihui.name
> Department of Statistics, Iowa State University
> 2215 Snedecor Hall, Ames, IA
Re: [Rd] CRAN policies
On 28.03.2012 00:07, Hadley Wickham wrote: On Tue, Mar 27, 2012 at 6:52 AM, Prof Brian Ripley wrote: CRAN has for some time had a policies page at http://cran.r-project.org/web/packages/policies.html and we would like to draw this to the attention of package maintainers. In particular, please Thanks for the pointer - I did not know that this page existed. In general, is there some easy way to track changes to this page and the R extension manual over time? It is difficult to keep track of the best practices. I'd also like to get clarification on "Packages should not write in the users' home filespace, nor anywhere else on the file system apart from the R session's temporary directory (or during installation in the location pointed to by TMPDIR: and such usage should be cleaned up)." - what is recommended practice for packages to maintain state across instances? Operating systems have standards for where applications can store settings (e.g. as described in http://pypi.python.org/pypi/appdirs/1.2.0). Is it acceptable for packages to follow these conventions?
The policy is meant to prevent packages from overwriting user data, or from generating loads of temporary files from examples and polluting, e.g., the working directory. Uwe Ligges
Hadley
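The part of the policy about the session's temporary directory can be sketched in a few lines of R. This is only an illustration under stated assumptions: the helper name write_example_output is hypothetical, and it covers throwaway output from examples, not the open question of persistent state across sessions.

```r
# Hypothetical helper (not part of any package): write example output
# under the R session's temporary directory instead of the working
# directory or the user's home filespace.
write_example_output <- function(x) {
  out <- tempfile(pattern = "example-", fileext = ".csv")  # lives in tempdir()
  write.csv(x, out, row.names = FALSE)
  out
}

path <- write_example_output(head(cars))
stopifnot(file.exists(path))

# Clean up after use, as the policy text asks.
unlink(path)
```

The whole session directory is removed when R exits normally, but removing files explicitly keeps long sessions tidy.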
Re: [Rd] CRAN policies
On 27.03.2012 20:33, Jeffrey Ryan wrote: Thanks Uwe for the clarification on what goes and what stays. Still fuzzy on the notion of "significant" though. Do you have an example or two for the list? We have to look at those notes again and again in order to find if something important is noted, hence please always try to avoid all notes unless the effect is really intended! Consider the Note "No visible binding for global variable" We cannot know if your code intends to use such a global variable (which is undesirable in most cases), hence would let is pass if it seems to be sensible. Another Note such as "empty section" or "partial argument match" can quickly be fixed, hence just do it and don't waste our time. Best, Uwe Ligges Jeff P.S. I meant to also thank all of CRAN volunteers for the momentous efforts involved, and it is nice to see some explanation of how we can help, as well as a peek into what goes on 'behind the curtain' ;-) On 3/27/12 1:19 PM, "Uwe Ligges" wrote: On 27.03.2012 19:10, Jeffrey Ryan wrote: Is there a distinction as to NOTE vs. WARNING that is documented? I've always assumed (wrongly?) that NOTES weren't an issue with publishing on CRAN, but that they may change to WARNINGS at some point. We won't kick packages off CRAN for Notes (but we will if Warnings are not fixed), but we may not accept new submissions with significant Notes. Best, Uwe Ligges Is the process by which this happens documented somewhere? Jeff On 3/27/12 11:09 AM, "Gabor Grothendieck" wrote: 2012/3/27 Uwe Ligges: On 27.03.2012 17:09, Gabor Grothendieck wrote: On Tue, Mar 27, 2012 at 7:52 AM, Prof Brian Ripley wrote: CRAN has for some time had a policies page at http://cran.r-project.org/web/packages/policies.html and we would like to draw this to the attention of package maintainers. In particular, please - always send a submission email to c...@r-project.org with the package name and version on the subject line. 
Emails sent to individual members of the team will result in delays at best. - run R CMD check --as-cran on the tarball before you submit it. Do this with the latest version of R possible: definitely R 2.14.2, preferably R 2.15.0 RC or a recent R-devel. (Later versions of R are able to give better diagnostics, e.g. for compiled code and especially on Windows. They may also have extra checks for recently uncovered problems.) Also, please note that CRAN has a very heavy workload (186 packages were published last week) and to remain viable needs package maintainers to make its life as easy as possible. Regarding the part about "warnings or significant notes" in that page, its impossible to know which notes are significant and which ones are not significant except by trial and error. Right, it needs human inspection to identify false positives. We believe most package maintainers are able to see if he or she is hit by such a false positive. The problem is that a note is generated and the note is correct. Its not a false positive. But that does not tell you whether its "significant" or not. There is no way to know. One can either try to remove all notes (which may not be feasible) or just upload it and by trial and error find out if its accepted or not. -- Statistics& Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] CRAN policies
On 27.03.2012 20:36, Gabor Grothendieck wrote: 2012/3/27 Uwe Ligges: On 27.03.2012 19:10, Jeffrey Ryan wrote: Is there a distinction as to NOTE vs. WARNING that is documented? I've always assumed (wrongly?) that NOTES weren't an issue with publishing on CRAN, but that they may change to WARNINGS at some point. We won't kick packages off CRAN for Notes (but we will if Warnings are not fixed), but we may not accept new submissions with significant Notes. Yes, I understand that but that does not really address the problem that one has no idea of whether a Note is significant or not so the only way to determine its significance is to submit your package and see if it's accepted or not.
We have to look at those notes again and again in order to find out whether something important is noted, hence please always try to avoid all notes unless the effect is really intended! Consider the Note "No visible binding for global variable". We cannot know if your code intends to use such a global variable (which is undesirable in most cases), hence we would let it pass if it seems to be sensible. Another Note such as "empty section" or "partial argument match" can quickly be fixed, hence just do it and don't waste our time. Best, Uwe Ligges
Re: [Rd] CRAN policies
2012/3/28 Uwe Ligges :
> On 27.03.2012 20:33, Jeffrey Ryan wrote:
>> Thanks Uwe for the clarification on what goes and what stays.
>> Still fuzzy on the notion of "significant" though. Do you have an example or two for the list?
> We have to look at those notes again and again in order to find if something important is noted, hence please always try to avoid all notes unless the effect is really intended!
> Consider the Note "No visible binding for global variable" We cannot know if your code intends to use such a global variable (which is undesirable in most cases), hence would let it pass if it seems to be sensible.
> Another Note such as "empty section" or "partial argument match" can quickly be fixed, hence just do it and don't waste our time.
> Best,
> Uwe Ligges
What is the point of notes vs warnings if you have to get rid of both of them? Furthermore, if there are notes that you don't have to get rid of, it's not fair that package developers should have to waste their time on things that are actually acceptable. Finally, it makes the whole system arbitrary since packages can be rejected based on undefined rules. Either divide notes into significant notes and ordinary notes and clearly label them as such in the output of R CMD check, or else make the significant notes warnings so one can know in advance whether the package passes R CMD check or not.
-- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
Re: [Rd] CRAN policies
On 28.03.2012 16:30, Gabor Grothendieck wrote: 2012/3/28 Uwe Ligges: On 27.03.2012 20:33, Jeffrey Ryan wrote: Thanks Uwe for the clarification on what goes and what stays. Still fuzzy on the notion of "significant" though. Do you have an example or two for the list? We have to look at those notes again and again in order to find if something important is noted, hence please always try to avoid all notes unless the effect is really intended! Consider the Note "No visible binding for global variable" We cannot know if your code intends to use such a global variable (which is undesirable in most cases), hence would let it pass if it seems to be sensible. Another Note such as "empty section" or "partial argument match" can quickly be fixed, hence just do it and don't waste our time. Best, Uwe Ligges What is the point of notes vs warnings if you have to get rid of both of them? Furthermore, if there are notes that you don't have to get rid of, it's not fair that package developers should have to waste their time on things that are actually acceptable. Finally, it makes the whole system arbitrary since packages can be rejected based on undefined rules. Either divide notes into significant notes and ordinary notes and clearly label them as such in the output of R CMD check or else make the significant notes warnings so one can know in advance whether the package passes R CMD check or not.
I tried to make clear that we cannot decide that automatically: it needs human inspection and thinking to tell whether some Note is significant or not. That is why they are not Warnings, which are reserved for cases where we are sure things have to be fixed. Please always try to avoid all notes unless the effect is really intended! How hard can it be? If Notes could be completely ignored, they would not be Notes. Uwe
Re: [Rd] CRAN policies
On Thu, Mar 29, 2012 at 3:30 AM, Gabor Grothendieck wrote: > 2012/3/28 Uwe Ligges : >> >> >> On 27.03.2012 20:33, Jeffrey Ryan wrote: >>> >>> Thanks Uwe for the clarification on what goes and what stays. >>> >>> Still fuzzy on the notion of "significant" though. Do you have an example >>> or two for the list? >> >> >> >> We have to look at those notes again and again in order to find if something >> important is noted, hence please always try to avoid all notes unless the >> effect is really intended! >> >> >> Consider the Note "No visible binding for global variable" >> We cannot know if your code intends to use such a global variable (which is >> undesirable in most cases), hence would let is pass if it seems to be >> sensible. >> >> Another Note such as "empty section" or "partial argument match" can quickly >> be fixed, hence just do it and don't waste our time. >> >> Best, >> Uwe Ligges > > What is the point of notes vs warnings if you have to get rid of both > of them? Furthermore, if there are notes that you don't have to get > rid of its not fair that package developers should have to waste their > time on things that are actually acceptable. Finally, it makes the > whole system arbitrary since packages can be rejected based on > undefined rules. > The "notes" are precisely the things for which clear rules can't be written. They are reported by CMD check because they are usually signs of coding errors, but are not warnings because their use is sometimes justified. The 'No visible binding for global variable" is a good example. This found some bugs in my 'survey' package, which I removed. There is still one note of this type, which arises when I have to handle two different versions of the hexbin package with different internal structures. The note is a false positive because the use is guarded by an if(), but CMD check can't tell this. 
So, it's a good idea to remove all Notes that can be removed without introducing other code problems, which is nearly all of them, but occasionally there may be a good reason for code that produces a Note. But if you want a simple, unambiguous, mechanical rule for *your* packages, just eliminate all Notes. -thomas -- Thomas Lumley Professor of Biostatistics University of Auckland __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
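The guarded-use false positive described above can be reproduced outside of R CMD check. What follows is a minimal sketch, assuming the codetools package (the code analysis behind this Note) is installed; the function pick_layout and the variable legacy_layout are hypothetical stand-ins, not code from the survey or hexbin packages.

```r
library(codetools)

# Hypothetical function: 'legacy_layout' is defined nowhere, and its use
# is guarded by exists(), so the code is safe at run time.
pick_layout <- function(x) {
  if (exists("legacy_layout")) legacy_layout else x
}

# checkUsage() reports through cat() by default, so capture the text.
notes <- capture.output(checkUsage(pick_layout, name = "pick_layout"))
print(notes)

# The static check cannot see the guard and still reports the variable.
stopifnot(any(grepl("no visible binding for global variable", notes,
                    fixed = TRUE)))
```

The analysis is purely static, which is why a run-time guard such as if() or exists() does not silence it.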
Re: [Rd] CRAN policies
On Wed, Mar 28, 2012 at 11:52 PM, Thomas Lumley wrote: > On Thu, Mar 29, 2012 at 3:30 AM, Gabor Grothendieck > wrote: >> 2012/3/28 Uwe Ligges : >>> >>> >>> On 27.03.2012 20:33, Jeffrey Ryan wrote: Thanks Uwe for the clarification on what goes and what stays. Still fuzzy on the notion of "significant" though. Do you have an example or two for the list? >>> >>> >>> >>> We have to look at those notes again and again in order to find if something >>> important is noted, hence please always try to avoid all notes unless the >>> effect is really intended! >>> >>> >>> Consider the Note "No visible binding for global variable" >>> We cannot know if your code intends to use such a global variable (which is >>> undesirable in most cases), hence would let is pass if it seems to be >>> sensible. >>> >>> Another Note such as "empty section" or "partial argument match" can quickly >>> be fixed, hence just do it and don't waste our time. >>> >>> Best, >>> Uwe Ligges >> >> What is the point of notes vs warnings if you have to get rid of both >> of them? Furthermore, if there are notes that you don't have to get >> rid of its not fair that package developers should have to waste their >> time on things that are actually acceptable. Finally, it makes the >> whole system arbitrary since packages can be rejected based on >> undefined rules. >> > > The "notes" are precisely the things for which clear rules can't be > written. They are reported by CMD check because they are usually > signs of coding errors, but are not warnings because their use is > sometimes justified. > > The 'No visible binding for global variable" is a good example. This > found some bugs in my 'survey' package, which I removed. There is > still one note of this type, which arises when I have to handle two > different versions of the hexbin package with different internal > structures. The note is a false positive because the use is guarded > by an if(), but CMD check can't tell this. 
So, it's a good idea to > remove all Notes that can be removed without introducing other code > problems, which is nearly all of them, but occasionally there may be a > good reason for code that produces a Note. > > But if you want a simple, unambiguous, mechanical rule for *your* > packages, just eliminate all Notes. I think it would be more objective and also easiest for everyone if notes were accepted. It might be that over time some notes could be split into multiple cases some of which are warnings and others continue to be notes. That way package developers don't have to waste their time on getting rid of notes which don't matter and the CRAN maintainers can turn the task of reviewing notes over to the computer. -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] CRAN policies
On Thu, 2012-03-29 at 16:52 +1300, Thomas Lumley wrote:
> The "No visible binding for global variable" is a good example. This found some bugs in my 'survey' package, which I removed. There is still one note of this type, which arises when I have to handle two different versions of the hexbin package with different internal structures. The note is a false positive because the use is guarded by an if(), but CMD check can't tell this. So, it's a good idea to remove all Notes that can be removed without introducing other code problems, which is nearly all of them, but occasionally there may be a good reason for code that produces a Note.
'occasionally' seems like an understatement.
Here's an example:
data(cars)
lm(speed ~ dist, cars)    # would produce global variables NOTE
lm("speed ~ dist", cars)  # would not produce the NOTE
While the change required to avoid the CRAN NOTE is small, I can't think of a single example or text on using formulas that recommends quoting the formula as a best practice. I'm not sure how users or package authors are supposed to know that they should use a (non-standard) way of specifying the formula to avoid wasting their time, and the CRAN volunteers' time. I'm certain that there are many other examples, but this one was easy to demonstrate.
Regards, - Brian
-- Brian G. Peterson http://braverock.com/brian/ Ph: 773-459-4973 IM: bgpbraverock
Re: [Rd] CRAN policies
On Mar 29, 2012, at 14:58 , Brian G. Peterson wrote: > On Thu, 2012-03-29 at 16:52 +1300, Thomas Lumley wrote: >> The 'No visible binding for global variable" is a good example. This >> found some bugs in my 'survey' package, which I removed. There is >> still one note of this type, which arises when I have to handle two >> different versions of the hexbin package with different internal >> structures. The note is a false positive because the use is guarded >> by an if(), but CMD check can't tell this. So, it's a good idea to >> remove all Notes that can be removed without introducing other code >> problems, which is nearly all of them, but occasionally there may be a >> good reason for code that produces a Note. >> > 'occasionally' seems like an understatement. > > Here's an example: > > data(cars) > lm(speed ~ dist,cars) #would produce global variables NOTE > lm("speed ~ dist",cars) # would not produce the NOTE Context, please. Where does this happen? (and why do you need data(cars)?) I find it hard to believe that quoting the formula should be the solution to this issue. There must be tons of examples to the contrary. > > While the change required to avoid the CRAN NOTE is small, I can't think > of a single example or text on using formulas that recommends quoting > the formula as a best practice. I'm not sure how users or package > authors are supposed to know that they should use a (non standard) way > of specifying the formula to avoid wasting their time, and the CRAN > volunteers time. I'm certain that there are many other examples, but > this one was easy to demonstrate. > > Regards, > > - Brian > > -- > Brian G. 
Peterson > http://braverock.com/brian/ > Ph: 773-459-4973 > IM: bgpbraverock > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel -- Peter Dalgaard, Professor Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
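One way to supply the context asked for above is to run the code analysis directly on both forms of the call. This is a sketch under assumptions: it relies on the codetools package, the wrapper names fit_unquoted and fit_quoted are hypothetical, and what is reported for the unquoted formula may depend on the codetools version, so no expected output is shown for it.

```r
library(codetools)

# Hypothetical wrappers around the two calls from the example.
fit_unquoted <- function(d) lm(speed ~ dist, d)
fit_quoted   <- function(d) lm("speed ~ dist", d)

unquoted_notes <- capture.output(checkUsage(fit_unquoted, name = "fit_unquoted"))
quoted_notes   <- capture.output(checkUsage(fit_quoted, name = "fit_quoted"))

# Inspect what, if anything, is reported for each form.
print(unquoted_notes)
print(quoted_notes)
```

The quoted version contains no bare symbols other than lm and d, so there is nothing for the checker to flag there; the interesting comparison is whatever the unquoted version reports on a given installation.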
Re: [Rd] CRAN policies
On 03/29/2012 05:00 AM, r-devel-requ...@r-project.org wrote: The "No visible binding for global variable" is a good example. This found some bugs in my 'survey' package, which I removed. There is still one note of this type, which arises when I have to handle two different versions of the hexbin package with different internal structures. The note is a false positive because the use is guarded by an if(), but CMD check can't tell this. So, it's a good idea to remove all Notes that can be removed without introducing other code problems, which is nearly all of them, but occasionally there may be a good reason for code that produces a Note.
The survival package has a similar special case: the routines for expected population survival are set up to accept multiple types of date format, so they have lines like
if (class(x) == 'chron') { y <- as.numeric(x - chron("01/01/1960")) }
This leaves me with two extraneous "no visible binding" messages. There used to be half a dozen but I've tried to remove as many as possible, for all the good reasons already articulated by the maintainers. It still remains that 99/100 of the "no visible binding" messages I've seen over the years were misspelled variable names, and the message is a very welcome check.
Terry Therneau
Re: [Rd] CRAN policies
On 29 March 2012 at 07:58, Brian G. Peterson wrote:
| On Thu, 2012-03-29 at 16:52 +1300, Thomas Lumley wrote:
| > The "No visible binding for global variable" is a good example. This found some bugs in my 'survey' package, which I removed. There is still one note of this type, which arises when I have to handle two different versions of the hexbin package with different internal structures. The note is a false positive because the use is guarded by an if(), but CMD check can't tell this. So, it's a good idea to remove all Notes that can be removed without introducing other code problems, which is nearly all of them, but occasionally there may be a good reason for code that produces a Note.
| 'occasionally' seems like an understatement.
| Here's an example:
| data(cars)
| lm(speed ~ dist,cars) #would produce global variables NOTE
| lm("speed ~ dist",cars) # would not produce the NOTE
| While the change required to avoid the CRAN NOTE is small, I can't think of a single example or text on using formulas that recommends quoting the formula as a best practice. I'm not sure how users or package authors are supposed to know that they should use a (non standard) way of specifying the formula to avoid wasting their time, and the CRAN volunteers time. I'm certain that there are many other examples, but this one was easy to demonstrate.
And it's close to my personal favourite of
with( cars, ... some expression involving dist and / or speed ... )
which gives the same warning about dist and speed being unknown globals. Punishment for good coding style -- gotta love it.
Now, we all want high-quality packages. We all strive to have as few false positives as possible. And we all understand that writing a parser is freaking hard.
One fudge-y way of helping with this may be via an overrides file. This is what Debian does to suppress known / tolerated violations of what the 'lintian' package checker picks up on. For the R package, I have a fair number of these: the file for the r-base-core binary is currently 83 lines long and ends with
r-base-core: executable-not-elf-or-script usr/lib/R/bin/Rdiff
r-base-core: image-file-in-usr-lib usr/lib/R/library/graphics/help/figures/mai.png
r-base-core: image-file-in-usr-lib usr/lib/R/library/graphics/help/figures/oma.png
r-base-core: image-file-in-usr-lib usr/lib/R/library/graphics/help/figures/pch.png
r-base-core: executable-not-elf-or-script usr/lib/R/bin/Rd2pdf
That is, two warnings on files with 755 modes in a non-PATH location (fine, that's how R works) and the same for image files below /usr/lib (when the FHS probably prefers them below /usr/share/). You pipe the output of a lintian run into 'lintian-info' and you get longer one- or two-paragraph descriptions with further pointers on the violations.
Does this sound like something worthwhile to add to the R CMD check system? Should we consider allowing overrides to make known good exceptions go away?
Dirk -- R/Finance 2012 Conference on May 11 and 12, 2012 at UIC in Chicago, IL See agenda, registration details and more at http://www.RinFinance.com
Re: [Rd] CRAN policies
On 3/29/2012 7:07 AM, Dirk Eddelbuettel wrote: On 29 March 2012 at 07:58, Brian G. Peterson wrote: | On Thu, 2012-03-29 at 16:52 +1300, Thomas Lumley wrote: |> The 'No visible binding for global variable" is a good example. This |> found some bugs in my 'survey' package, which I removed. There is |> still one note of this type, which arises when I have to handle two |> different versions of the hexbin package with different internal |> structures. The note is a false positive because the use is guarded |> by an if(), but CMD check can't tell this. So, it's a good idea to |> remove all Notes that can be removed without introducing other code |> problems, which is nearly all of them, but occasionally there may be a |> good reason for code that produces a Note. |> | 'occasionally' seems like an understatement. | | Here's an example: | | data(cars) | lm(speed ~ dist,cars) #would produce global variables NOTE | lm("speed ~ dist",cars) # would not produce the NOTE Another example using library(ggplot2): =qplot(time., value, data=X, geom='line', facets=facets, color=variable, xlim=xlim, ylim=ylim, xlab='days', ylab='displacement (inches)', ...), "value" and "variable" are columns of "X". If I knew how to list this in an "overrides" file, I would do so. My experience is similar to what others mentioned: 99 percent of the "No visible bindings" messages I've seen are my coding errors. This one is not. I don't recall for sure, but I think I checked trying putting "value" and "variable" in quotes, and it didn't work. The function that includes this call to "qplot" actually includes the definition of a global variable "time.", which is NOT used, because "X" has a column named "time.". The global variable "time." is a character string, while "X$time." is class POSIXct. 
I mention this, because this discussion suddenly told me how to get rid of this NOTE: Precede this call to qplot with something like the following: value <- variable <- "NOTE: Define these variables to override the NOTE impulse in R CMD check' I haven't tried this with "qplot", but it ignores the global variable "Time." and uses the "Time." column of "X", so it should work. I just tried something similar with "lm", and it ignored a global variable in favor of a column of "X". This is a silly kludge, but it's simple and does not require a modification to "R CMD check". Spencer | | While the change required to avoid the CRAN NOTE is small, I can't think | of a single example or text on using formulas that recommends quoting | the formula as a best practice. I'm not sure how users or package | authors are supposed to know that they should use a (non standard) way | of specifying the formula to avoid wasting their time, and the CRAN | volunteers time. I'm certain that there are many other examples, but | this one was easy to demonstrate. And it's close to my personal favourite of with( cars, ... some expression involving dist and / or speed ... ) which gives the same warning about dist and speed being unknown globals. Punishment for good coding style -- gotta love it. Now, we all want high-quality packages. We all strive to have as few false positives. And we all understand that writing a parser if freaking hard. One fudge-y way of helping with this may be via an overrides file. This is what Debian does to suppress known / tolerated violations of what the 'lintian' package checker picks up on. 
For the R package, I have a fair number of these: the file for the r-base-core binary is currently 83 lines long and it ends with

    r-base-core: executable-not-elf-or-script usr/lib/R/bin/Rdiff
    r-base-core: image-file-in-usr-lib usr/lib/R/library/graphics/help/figures/mai.png
    r-base-core: image-file-in-usr-lib usr/lib/R/library/graphics/help/figures/oma.png
    r-base-core: image-file-in-usr-lib usr/lib/R/library/graphics/help/figures/pch.png
    r-base-core: executable-not-elf-or-script usr/lib/R/bin/Rd2pdf

that is, two warnings on files with 755 modes in a non-PATH location (fine, that's how R works) and idem with image files below /usr/lib (when the FHS probably prefers them below /usr/share/). You pipe the output of a lintian run into 'lintian-info' and you get longer descriptions of one or two paragraphs, with further pointers on the violations. Does this sound like something worthwhile to add to the R CMD check system? Should we consider allowing overrides to make known good exceptions go away? Dirk
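Spencer's workaround above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the data frame `X`, the function `ratio_by_row` and the placeholder string are hypothetical, and `with()` stands in for the qplot call. The sketch shows only that a column of the data takes precedence over a same-named local variable; it does not by itself show what R CMD check reports.

```r
# Hypothetical data frame; the column names mirror the thread's example.
X <- data.frame(value = c(2, 4, 6), variable = c(1, 2, 3))

ratio_by_row <- function(data) {
  # Dummy local bindings (the kludge): they exist only so that a static
  # checker sees a definition for these names. They are never used below.
  value <- variable <- "placeholder to quiet the 'no visible binding' NOTE"
  # with() evaluates the expression inside 'data' first, so the columns
  # mask the placeholder strings defined above.
  with(data, value / variable)
}

result <- ratio_by_row(X)
print(result)
```

The cost of the kludge is that a genuinely misspelled column name sharing one of the dummy names would no longer be flagged, which is presumably why it was called silly.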
Re: [Rd] CRAN policies
> -Original Message-
> From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Terry Therneau
> Sent: Thursday, March 29, 2012 7:02 AM
> To: r-devel@r-project.org
> Subject: Re: [Rd] CRAN policies
>
> On 03/29/2012 05:00 AM, r-devel-requ...@r-project.org wrote:
> > The "No visible binding for global variable" is a good example. This found some bugs in my 'survey' package, which I removed. There is still one note of this type, which arises when I have to handle two different versions of the hexbin package with different internal structures. The note is a false positive because the use is guarded by an if(), but CMD check can't tell this. So, it's a good idea to remove all Notes that can be removed without introducing other code problems, which is nearly all of them, but occasionally there may be a good reason for code that produces a Note.
>
> The survival package has a similar special case: the routines for expected population survival are set up to accept multiple types of date format so have lines like
>     if (class(x) == 'chron') { y <- as.numeric(x - chron("01/01/1960")) }
> This leaves me with two extraneous "no visible binding" messages.

Suppose we defined a function like

    NO_VISIBLE_BINDING <- function(expr) expr

and added an entry to the stuff in codetools so that it would not check for misspelled object names in calls to NO_VISIBLE_BINDING. Then Terry could write that line as

    if (class(x) == "chron") { y <- as.numeric(x - NO_VISIBLE_BINDING(chron)("01/01/1960")) }

and the Notes would disappear.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> There used to be half a dozen but I've tried to remove as many as possible, for all the good reasons already articulated by the maintainers.
>
> It still remains that 99/100 of the "no visible binding" messages I've seen over the years were misspelled variable names, and the message is a very welcome check.
>
> Terry Therneau
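The proposal above can be tried out directly, at least its run-time half. The following is a minimal sketch, assuming that `as.Date()` stands in for `chron()` so that it runs without extra packages; `days_since_1960` is a hypothetical helper. The other half of the proposal, teaching codetools to skip the wrapper's argument, is not implemented anywhere in this sketch.

```r
# Identity wrapper as proposed in the thread. It only returns its argument;
# any special treatment by codetools is hypothetical and not shown here.
NO_VISIBLE_BINDING <- function(expr) expr

# Hypothetical helper: as.Date() stands in for chron().
days_since_1960 <- function(x) {
  if (inherits(x, "Date")) {
    # The wrapper returns the function unchanged, which is then called.
    as.numeric(x - NO_VISIBLE_BINDING(as.Date)("1960-01-01"))
  } else {
    as.numeric(x)
  }
}

d <- days_since_1960(as.Date("1960-01-11"))
print(d)
```

Because arguments are evaluated lazily, the wrapped name is looked up only when the guarded branch actually runs, so behaviour is the same as for the unwrapped call.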
Re: [Rd] CRAN policies
William Dunlap tibco.com> writes: > > -Original Message- > > The survival package has a similar special case: the routines for > > expected population survival are set up to accept multiple types of date > > format so have lines like > > if (class(x) == 'chron') { y <- as.numeric(x - chron("01/01/1960")} > > This leaves me with two extraneous "no visible binding" messages. > > Suppose we defined a function like > NO_VISIBLE_BINDING(expr) expr > and added an entry to the stuff in codetools so that it > would not check for misspelled object names in call to > NO_VISIBLE_BINDING. Then Terry could write that line as > if (class(x) == "chron") { y <- as.numeric(x - NO_VISIBLE_BINDING(chron) ("01/01/1960")} > and the Notes would disappear. > That's ok for package code, but what about test suites? Say there was a test on the result of "with(DF,a+b)", you wouldn't want to change the test to "with (DF,NO_VISIBLE_BINDING(a)+NO_VISIBLE_BINDING(b))" not just because that's long and onerous, but because that's *changing* the test i.e. introducing a difference between what's tested and what user code will do. As others suggested, how about a new category: MEMO. The "no visible binding" NOTE would be downgraded to MEMO. CRAN maintainers could then ignore MEMOs more easily. What I really like about NOTES is that when new checks are added to R then as a package maintainer you know you don't have to fix them straight away. If a new WARNING shows up on r-devel daily checks, however, then you've got some warning about the WARNING that you need to fix more urgently and may even accelerate a release. So it's not just about checks when submitting a package, but what happens afterwards as R itself (and packages in Depends) move on. In other words, you know you need to fix new NOTES but not as urgently as new WARNINGS. Matthew __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] CRAN policies
> -Original Message-
> From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Matthew Dowle
> Sent: Thursday, March 29, 2012 10:41 AM
> To: r-de...@stat.math.ethz.ch
> Subject: Re: [Rd] CRAN policies
>
> William Dunlap tibco.com> writes:
> > > The survival package has a similar special case: the routines for expected population survival are set up to accept multiple types of date format so have lines like
> > >     if (class(x) == 'chron') { y <- as.numeric(x - chron("01/01/1960")) }
> > > This leaves me with two extraneous "no visible binding" messages.
> >
> > Suppose we defined a function like
> >     NO_VISIBLE_BINDING <- function(expr) expr
> > and added an entry to the stuff in codetools so that it would not check for misspelled object names in calls to NO_VISIBLE_BINDING. Then Terry could write that line as
> >     if (class(x) == "chron") { y <- as.numeric(x - NO_VISIBLE_BINDING(chron)("01/01/1960")) }
> > and the Notes would disappear.
>
> That's ok for package code, but what about test suites? Say there was a test on the result of "with(DF,a+b)", you wouldn't want to change the test to "with(DF,NO_VISIBLE_BINDING(a)+NO_VISIBLE_BINDING(b))" not just because that's long and onerous, but because that's *changing* the test i.e. introducing a difference between what's tested and what user code will do.

I don't know if test suites need to be checked for no visible bindings - if there is a real problem the test ought to fail.

codetools should be able to do special checks for known functions that do not follow the standard evaluation rules. E.g., do not check any arguments of `~`, do not check the 'expr' argument of with, do not check the subset or weights arguments of lm.

If a package writer introduces a new function with nonstandard evaluation, perhaps the package could include some information about the matter in a file that codetools could source before running its checks.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
Re: [Rd] CRAN policies
On 3/29/2012 11:29 AM, William Dunlap wrote:
> codetools should be able to do special checks for known functions that do not follow the standard evaluation rules. E.g., do not check any arguments of `~`, do not check the 'expr' argument of with, do not check the subset or weights arguments of lm.
>
> If a package writer introduces a new function with nonstandard evaluation, perhaps the package could include some information about the matter in a file that codetools could source before running its checks.

This gets my vote -- but I don't have the bandwidth nor authority to effect the change ;-)

Spencer
Re: [Rd] CRAN policies
> > codetools should be able to do special checks for known functions that do not follow the standard evaluation rules. E.g., do not check any arguments of `~`, do not check the 'expr' argument of with, do not check the subset or weights arguments of lm.
> >
> > If a package writer introduces a new function with nonstandard evaluation, perhaps the package could include some information about the matter in a file that codetools could source before running its checks.
>
> This gets my vote -- but I don't have the bandwidth nor authority to effect the change ;-) Spencer

Most of that stuff is already in codetools, at least when it is checking functions with checkUsage(). E.g., arguments of ~ are not checked. The expr argument to with() will not be checked if you add skipWith=TRUE to the call to checkUsage.

    > library(codetools)
    > checkUsage(function(dataFrame) with(dataFrame, {Num/Den ; Resp ~ Pred}))
    : no visible binding for global variable 'Num' (:1)
    : no visible binding for global variable 'Den' (:1)
    > checkUsage(function(dataFrame) with(dataFrame, {Num/Den ; Resp ~ Pred}), skipWith=TRUE)
    > checkUsage(function(dataFrame) with(DataFrame, {Num/Den ; Resp ~ Pred}), skipWith=TRUE)
    : no visible binding for global variable 'DataFrame'

The only part that I don't see is the mechanism to add code-walker functions to the environment in codetools that has the standard list of them for functions with nonstandard evaluation:

    > objects(codetools:::collectUsageHandlers, all=TRUE)
     [1] "$"             "$<-"           ".Internal"
     [4] "::"            ":::"           "@"
     [7] "@<-"           "{"             "~"
    [10] "<-"            "<<-"           "="
    [13] "assign"        "binomial"      "bquote"
    [16] "data"          "detach"        "expression"
    [19] "for"           "function"      "Gamma"
    [22] "gaussian"      "if"            "library"
    [25] "local"         "poisson"       "quasi"
    [28] "quasibinomial" "quasipoisson"  "quote"
    [31] "Quote"         "require"       "substitute"
    [34] "with"

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
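The effect of skipWith described above can also be captured programmatically rather than read off the console. This is a rough sketch, assuming the codetools package is installed; `collect_notes` is a hypothetical helper that relies only on the `report` and `skipWith` arguments of `checkUsage()`.

```r
library(codetools)

# Hypothetical helper: gather checkUsage() messages in a character vector
# instead of printing them.
collect_notes <- function(fun, ...) {
  notes <- character(0)
  checkUsage(fun, report = function(msg) notes <<- c(notes, msg), ...)
  notes
}

f <- function(dataFrame) with(dataFrame, {Num/Den ; Resp ~ Pred})

# Default settings: the body of with() is checked, so Num and Den are flagged.
default_notes <- collect_notes(f)

# skipWith = TRUE: the expression argument of with() is not checked.
skipped_notes <- collect_notes(f, skipWith = TRUE)

print(default_notes)
print(length(skipped_notes))
```

A developer could filter such a vector before reading it, which is one way to keep known false positives out of sight without changing the package code.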
Re: [Rd] CRAN policies
> Most of that stuff is already in codetools, at least when it is checking functions with checkUsage(). E.g., arguments of ~ are not checked. The expr argument to with() will not be checked if you add skipWith=TRUE to the call to checkUsage.
>
>     > library(codetools)
>     > checkUsage(function(dataFrame) with(dataFrame, {Num/Den ; Resp ~ Pred}))
>     : no visible binding for global variable 'Num' (:1)
>     : no visible binding for global variable 'Den' (:1)
>     > checkUsage(function(dataFrame) with(dataFrame, {Num/Den ; Resp ~ Pred}), skipWith=TRUE)
>     > checkUsage(function(dataFrame) with(DataFrame, {Num/Den ; Resp ~ Pred}), skipWith=TRUE)
>     : no visible binding for global variable 'DataFrame'
>
> The only part that I don't see is the mechanism to add code-walker functions to the environment in codetools that has the standard list of them for functions with nonstandard evaluation:
>
>     > objects(codetools:::collectUsageHandlers, all=TRUE)
>      [1] "$"             "$<-"           ".Internal"
>      [4] "::"            ":::"           "@"
>      [7] "@<-"           "{"             "~"
>     [10] "<-"            "<<-"           "="
>     [13] "assign"        "binomial"      "bquote"
>     [16] "data"          "detach"        "expression"
>     [19] "for"           "function"      "Gamma"
>     [22] "gaussian"      "if"            "library"
>     [25] "local"         "poisson"       "quasi"
>     [28] "quasibinomial" "quasipoisson"  "quote"
>     [31] "Quote"         "require"       "substitute"
>     [34] "with"

It seems like we really need a standard way to add metadata to functions:

    attr(with, "special_args") <- "expr"
    attr(lm, "special_args") <- c("formula", "weights", "subset")

This would be useful because it could automatically contribute to the documentation.

Similarly,

    attr(my.new.method, "s3method") <- c("my.new", "method")

could be useful.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
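The metadata idea above can be tried on a user-defined function without touching base R. The following is a sketch only: the attribute name "special_args" comes from the message, while `eval_in` and the accessor `special_args()` are hypothetical, and no checking tool is known to read such an attribute.

```r
# Hypothetical function whose 'expr' argument is evaluated non-standardly,
# inside the supplied data frame.
eval_in <- function(data, expr) {
  eval(substitute(expr), data, parent.frame())
}

# Attach the proposed metadata to the function object itself.
attr(eval_in, "special_args") <- "expr"

# Hypothetical accessor a checker or documentation tool might use:
# returns the declared argument names, or character(0) if none are declared.
special_args <- function(fun) {
  args <- attr(fun, "special_args", exact = TRUE)
  if (is.null(args)) character(0) else args
}

print(special_args(eval_in))
print(eval_in(data.frame(a = 1:3), sum(a)))
```

Setting the attribute on `with` or `lm` directly, as written in the message, would create a modified copy in the current environment rather than change the function in its package, so a real mechanism would need support from R itself.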
Re: [Rd] CRAN policies
I'm concerned this thread is heading the wrong way, towards techno-fixes for imaginary problems. R package-building is already encumbered with a huge set of complicated rules, and more instructions/rules, e.g. for metadata, would make things worse not better.

RCMD CHECK on the 'mvbutils' package generates over 300 Notes about "no visible binding...", which inevitably I just ignore. They arise because RCMD CHECK is too "stupid" to understand one of my preferred coding idioms (I'm not going to explain what-- that's beside the point). And RCMD CHECK always will be too "stupid" to understand everything that a rich language like R might quite reasonably cause experienced coders to do.

It should not be CRAN's business how I write my code, or even whether my code does what it is supposed to. It might be CRAN's business to try to work out whether my code breaks CRAN's policies, e.g. by causing R to crash horribly-- that's presumably what Warnings are for (but see below). And maybe there could be circumstances where an automatic check might be "worried" enough to alert the CRANia and require manual explanation and emails etc from a developer, but even that seems doomed given the growing deluge of packages.

RCMD CHECK currently functions both as a "sanitizer" for CRAN, and as a developer-tool. But the fact that the one program does both things seems accidental to me, and I think this dual-use is muddying the discussion. There's a big distinction between (i) code-checks that developers themselves might or might not find useful-- which should be left to the developer, and will vary from person to person-- and (ii) code-checks that CRAN enforces for its own peace-of-mind. Maybe it's convenient to have both functions in the same place, and it'd be fine to use Notes for one and Warnings for the other, but the different purposes should surely be kept clear.
Personally, in building over 10 packages (only 2 on CRAN), I haven't found RCMD CHECK to be of any use, except for the code-documentation and example-running bits. I know other people have different opinions, but that's the point: one-size-does-not-fit-all when it comes to coding tools.

And with respect to the Warnings themselves: I feel compelled to point out that it's logically impossible to fully check whether R code will do bad things. One has to wonder at what point adding new checks becomes futile or counterproductive. There must be over 2000 people who have written CRAN packages by now; every extra check and non-back-compatible additional requirement runs the risk of generating false-negatives and incurring many extra person-hours to "fix" non-problems. Plus someone needs to document and explain the check (adding to the rule mountain), plus there is the time spent in discussions like this..!

Mark

Mark Bravington
CSIRO CMIS
Marine Lab
Hobart
Australia
Re: [Rd] CRAN policies
On 12-03-29 09:29 PM, mark.braving...@csiro.au wrote: > I'm concerned this thread is heading the wrong way, towards > techno-fixes for imaginary problems. R package-building is already > encumbered with a huge set of complicated rules, and more > instructions/rules eg for metadata would make things worse not better. > > RCMD CHECK on the 'mvbutils' package generates over 300 Notes about > "no visible binding...", which inevitably I just ignore. They arise > because RCMD CHECK is too "stupid" to understand one of my preferred > coding idioms (I'm not going to explain what-- that's beside the > point). Actually, I think that is the point. If your code is generating that many notes then I think you should explain your idiom, so the checks can be made to accommodate it if it really is good. Otherwise, I'd be worried about the quality of your code. > And RCMD CHECK always will be too "stupid" to understand everything > that a rich language like R might quite reasonably cause experienced > coders to do. Possibly the interpreter is too stupid to understand it too? > It should not be CRAN's business how I write my code, or even whether > my code does what it is supposed to. It might be CRAN's business to > try to work out whether my code breaks CRAN's policies, eg by causing > R to crash horribly-- that's presumably what Warnings are for (but > see below). And maybe there could be circumstances where an automatic > check might be "worried" enough to alert the CRANia and require manual > explanation and emails etc from a developer, but even that seems > doomed given the growing deluge of packages. > > RCMD CHECK currently functions both as a "sanitizer" for CRAN, and as > a developer-tool. But the fact that the one programl does both things > seems accidental to me, and I think this dual-use is muddying the > discussion. 
> There's a big distinction between (i) code-checks that developers themselves might or might not find useful-- which should be left to the developer, and will vary from person to person--

I think this is a case where two heads are better than one. I did lots of checks before the CRAN checks existed, but the CRAN checks still found bugs in code that I considered very mature, including bugs in code that has been running without noticeable problems for over 15 years. Despite all the noise today, most of us are only talking about a small inconvenience around the intended meaning of "note", not about whether quality control is a bad thing. I've found the errors and warnings are always valid, even though I do not always like having to fix the bugs, and the notes are most often valid too. But there are a few false positives, so the checks that give notes are not yet reliable enough to give warnings or errors. But they should be sometime, so one should usually consider fixing the package code.

> and (ii) code-checks that CRAN enforces for its own peace-of-mind.

I think of this as being for the peace of mind of your package users.

> Maybe it's convenient to have both functions in the same place, and it'd be fine to use Notes for one and Warnings for the other, but the different purposes should surely be kept clear.
>
> Personally, in building over 10 packages (only 2 on CRAN), I haven't found RCMD CHECK to be of any use, except for the code-documentation and example-running bits. I know other people have different opinions, but that's the point: one-size-does-not-fit-all when it comes to coding tools.
>
> And wrto the Warnings themselves: I feel compelled to point out that it's logically impossible to fully check whether R code will do bad things. One has to wonder at what point adding new checks becomes futile or counterproductive.
> There must be over 2000 people who have written CRAN packages by now; every extra check and non-back-compatible additional requirement runs the risk of generating false-negatives and incurring many extra person-hours to "fix" non-problems.
> Plus someone needs to document and explain the check (adding to the rule mountain), plus there is the time spent in discussions like this..!

Bugs in your packages will require users to waste a lot of time too, and possibly reach faulty results with much more serious consequences. Just because perfection may never be attained, this does not mean that progress should not be attempted, in small steps. Compared to Statlib, which basically followed your recommended approach, CRAN is a vast improvement.

Paul

> Mark
>
> Mark Bravington
> CSIRO CMIS
> Marine Lab
> Hobart
> Australia
Re: [Rd] CRAN policies
csiro.au> writes: > There must be over 2000 people who have written CRAN packages by now; every extra > check and non-back-compatible additional requirement runs the risk of generating false-negatives and > incurring many extra person-hours to "fix" non-problems. Plus someone needs to document and explain the > check (adding to the rule mountain), plus there is the time spent in discussions like this..! Not sure where you're coming from on that. For example, Prof Ripley has added quite a few new NOTEs to QC.R over the last few months. These caught things I wasn't aware of in the two packages I maintain and I was more than happy to fix them. It improves quality, surely. There's only one particular NOTE causing an issue: 'no visible binding'. If it were made a MEMO, we can move on. All the other NOTEs can (and should) be fixed, can't they? Matthew __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] CRAN policies
Paul, > One of the things I have noticed with the R 2.15.0 RC and --as-cran is > that the I have to bump the version number of the working copy of my [snip] > > I am curious how other developers approach this. Regardless of --as-cran I find it very useful to use the date as minor part of the version number (e.g. hyperSpec 0.98-20120320), which I set automatically. Claudia -- Claudia Beleites Spectroscopy/Imaging Institute of Photonic Technology Albert-Einstein-Str. 9 07745 Jena Germany email: claudia.belei...@ipht-jena.de phone: +49 3641 206-133 fax: +49 2641 206-399 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
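A date-stamped version string like the one above can be generated in a couple of lines. This is a minimal sketch, not a description of how hyperSpec actually sets its version: `dated_version` is a hypothetical helper, and writing the result into a DESCRIPTION file is left out.

```r
# Hypothetical helper: append the date, as YYYYMMDD, to a base version.
dated_version <- function(base, date = Sys.Date()) {
  paste0(base, "-", format(date, "%Y%m%d"))
}

v <- dated_version("0.98", as.Date("2012-03-20"))
print(v)

# Such strings are valid package versions and compare in date order.
newer <- package_version(dated_version("0.98", as.Date("2012-03-21"))) >
  package_version(v)
print(newer)
```

With this scheme every rebuilt working copy carries a version later than the one on CRAN, which sidesteps the manual bump asked about at the start of the thread.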
Re: [Rd] CRAN policies
It looks like you define a few functions that use substitute() or sys.call() or similar functions to look at the unevaluated argument list. E.g.,

  "cq" <- function( ...) {
    # Saves putting in quotes!
    # E.G.: quoted( first, second, third) is the same as c( 'first', 'second', 'third')
    # wrapping by as.character means cq() returns character(0) not list()
    as.character( sapply( as.list( match.call( expand.dots=TRUE))[-1], as.character))
  }

%such.that% and %SUCH.THAT% do similar things. Almost all the complaints from check involve calls to a handful of such functions.

If you could tell codetools:::checkUsage that these functions did nonstandard evaluation on all or some of their arguments then the complaints would go away and other checks for real errors like misspellings would still be done.

Another possible part of the problem is that if checkUsage is checking a function like

  f <- function(x) paste(x, cq(suffix), sep=".")

it attributes the out-of-scope suffix problem to 'f' and doesn't mention that the immediate caller is 'cq', so you cannot easily filter output complaints about cq. (CRAN would not do such filtering, but a developer might.)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of mark.braving...@csiro.au
> Sent: Thursday, March 29, 2012 6:30 PM
> Cc: r-de...@stat.math.ethz.ch
> Subject: Re: [Rd] CRAN policies
>
> I'm concerned this thread is heading the wrong way, towards techno-fixes for imaginary problems. R package-building is already encumbered with a huge set of complicated rules, and more instructions/rules eg for metadata would make things worse not better.
>
> RCMD CHECK on the 'mvbutils' package generates over 300 Notes about "no visible binding...", which inevitably I just ignore. They arise because RCMD CHECK is too "stupid" to understand one of my preferred coding idioms (I'm not going to explain what-- that's beside the point). And RCMD CHECK always will be too "stupid" to understand everything that a rich language like R might quite reasonably cause experienced coders to do.
>
> It should not be CRAN's business how I write my code, or even whether my code does what it is supposed to. It might be CRAN's business to try to work out whether my code breaks CRAN's policies, eg by causing R to crash horribly-- that's presumably what Warnings are for (but see below). And maybe there could be circumstances where an automatic check might be "worried" enough to alert the CRANia and require manual explanation and emails etc from a developer, but even that seems doomed given the growing deluge of packages.
>
> RCMD CHECK currently functions both as a "sanitizer" for CRAN, and as a developer-tool. But the fact that the one program does both things seems accidental to me, and I think this dual-use is muddying the discussion. There's a big distinction between (i) code-checks that developers themselves might or might not find useful-- which should be left to the developer, and will vary from person to person-- and (ii) code-checks that CRAN enforces for its own peace-of-mind. Maybe it's convenient to have both functions in the same place, and it'd be fine to use Notes for one and Warnings for the other, but the different purposes should surely be kept clear.
>
> Personally, in building over 10 packages (only 2 on CRAN), I haven't found RCMD CHECK to be of any use, except for the code-documentation and example-running bits. I know other people have different opinions, but that's the point: one-size-does-not-fit-all when it comes to coding tools.
>
> And wrto the Warnings themselves: I feel compelled to point out that it's logically impossible to fully check whether R code will do bad things. One has to wonder at what point adding new checks becomes futile or counterproductive. There must be over 2000 people who have written CRAN packages by now; every extra check and non-back-compatible additional requirement runs the risk of generating false-negatives and incurring many extra person-hours to "fix" non-problems. Plus someone needs to document and explain the check (adding to the rule mountain), plus there is the time spent in discussions like this..!
>
> Mark
>
> Mark Bravington
> CSIRO CMIS
> Marine Lab
> Hobart
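The attribution issue Bill describes can be observed by collecting the checkUsage() messages instead of printing them. The sketch below assumes the codetools package is installed (it normally ships with R). collect_usage is a hypothetical helper, and cq here is a simplified stand-in for the mvbutils function, not a copy of it.

```r
library(codetools)

# Simplified stand-in for mvbutils' cq(): returns its unevaluated
# arguments as character strings.
cq <- function(...) {
  as.character(sapply(as.list(match.call(expand.dots = TRUE))[-1], as.character))
}

f <- function(x) paste(x, cq(suffix), sep = ".")

# Hypothetical helper: gather checkUsage() messages in a character vector
# by passing a collecting function as the 'report' argument.
collect_usage <- function(fun, name) {
  msgs <- character()
  checkUsage(fun, name = name, report = function(m) msgs <<- c(msgs, m))
  msgs
}

msgs <- collect_usage(f, "f")
print(msgs)        # the complaint about 'suffix' is labelled with 'f'
print(f("file"))   # the code itself runs without error
```

Because the message carries only the name of the function being checked, a developer who wanted to filter out complaints caused by cq() would have to inspect the body of f rather than the message text.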
Re: [Rd] CRAN policies
I'll echo Mark's concerns. R _used_ to be a language for "turning ideas into software quickly". Now it is more like "prototyping ideas in software quickly", and then spend a substantial amount of time trying to follow administrative rules to package the code. Quality has its costs. Many of the code checks I find quite useful, but the "no visible binding" one generates lots of nuisance notes for me. I must have a similar coding style to Mark. Kevin On Thu, Mar 29, 2012 at 8:29 PM, wrote: > I'm concerned this thread is heading the wrong way, towards techno-fixes > for imaginary problems. R package-building is already encumbered with a > huge set of complicated rules, and more instructions/rules eg for metadata > would make things worse not better. > > RCMD CHECK on the 'mvbutils' package generates over 300 Notes about "no > visible binding...", which inevitably I just ignore. They arise because > RCMD CHECK is too "stupid" to understand one of my preferred coding idioms > (I'm not going to explain what-- that's beside the point). And RCMD CHECK > always will be too "stupid" to understand everything that a rich language > like R might quite reasonably cause experienced coders to do. > > It should not be CRAN's business how I write my code, or even whether my > code does what it is supposed to. It might be CRAN's business to try to > work out whether my code breaks CRAN's policies, eg by causing R to crash > horribly-- that's presumably what Warnings are for (but see below). And > maybe there could be circumstances where an automatic check might be > "worried" enough to alert the CRANia and require manual explanation and > emails etc from a developer, but even that seems doomed given the growing > deluge of packages. > > RCMD CHECK currently functions both as a "sanitizer" for CRAN, and as a > developer-tool. But the fact that the one programl does both things seems > accidental to me, and I think this dual-use is muddying the discussion. 
> There's a big distinction between (i) code-checks that developers > themselves might or might not find useful-- which should be left to the > developer, and will vary from person to person-- and (ii) code-checks that > CRAN enforces for its own peace-of-mind. Maybe it's convenient to have both > functions in the same place, and it'd be fine to use Notes for one and > Warnings for the other, but the different purposes should surely be kept > clear. > > Personally, in building over 10 packages (only 2 on CRAN), I haven't found > RCMD CHECK to be of any use, except for the code-documentation and > example-running bits. I know other people have different opinions, but > that's the point: one-size-does-not-fit-all when it comes to coding tools. > > And wrto the Warnings themselves: I feel compelled to point out that it's > logically impossible to fully check whether R code will do bad things. One > has to wonder at what point adding new checks becomes futile or > counterproductive. There must be over 2000 people who have written CRAN > packages by now; every extra check and non-back-compatible additional > requirement runs the risk of generating false-negatives and incurring many > extra person-hours to "fix" non-problems. Plus someone needs to document > and explain the check (adding to the rule mountain), plus there is the time > spent in discussions like this..! > > Mark > > Mark Bravington > CSIRO CMIS > Marine Lab > Hobart > Australia > > From: r-devel-boun...@r-project.org [r-devel-boun...@r-project.org] On > Behalf Of Hadley Wickham [had...@rice.edu] > Sent: 30 March 2012 07:42 > To: William Dunlap > Cc: r-de...@stat.math.ethz.ch; Spencer Graves > Subject: Re: [Rd] CRAN policies > > > Most of that stuff is already in codetools, at least when it is checking > functions > > with checkUsage(). E.g., arguments of ~ are not checked. The expr > argument > > to with() will not be checked if you add skipWith=FALSE to the call to > checkUsage. 
> >
> > > library(codetools)
> >
> > > checkUsage(function(dataFrame) with(dataFrame, {Num/Den ; Resp ~ Pred}))
> > : no visible binding for global variable 'Num' (:1)
> > : no visible binding for global variable 'Den' (:1)
> >
> > > checkUsage(function(dataFrame) with(dataFrame, {Num/Den ; Resp ~ Pred}), skipWith=TRUE)
> >
> > > checkUsage(function(dataFrame) with(DataFrame, {Num/Den ; Resp ~ Pred}), skipWith=TRUE)
> > : no visible binding for global variable 'DataFrame'
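The quoted codetools session can be reproduced programmatically. This is a sketch assuming the codetools package is available; usage_notes is a hypothetical helper introduced for the example. Note that, as the session itself shows, it is skipWith=TRUE that switches off checking of the expression passed to with(), although the prose in the quoted message says skipWith=FALSE.

```r
library(codetools)

# Hypothetical helper: return checkUsage() messages as a character vector.
usage_notes <- function(fun, ...) {
  notes <- character()
  checkUsage(fun, report = function(m) notes <<- c(notes, m), ...)
  notes
}

ok  <- function(dataFrame) with(dataFrame, {Num/Den ; Resp ~ Pred})
bad <- function(dataFrame) with(DataFrame, {Num/Den ; Resp ~ Pred})  # misspelled argument

print(usage_notes(ok))                    # default: Num and Den are flagged
print(usage_notes(ok, skipWith = TRUE))   # the expr argument of with() is skipped
print(usage_notes(bad, skipWith = TRUE))  # the real misspelling is still reported
```

The formula Resp ~ Pred is never flagged in any of the three calls, which matches the remark that arguments of ~ are not checked.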
Re: [Rd] CRAN policies
On Fri, Mar 30, 2012 at 11:41 AM, Kevin Wright wrote:
> I'll echo Mark's concerns. R _used_ to be a language for "turning ideas
> into software quickly". Now it is more like "prototyping ideas in software
> quickly", and then spend a substantial amount of time trying to follow
> administrative rules to package the code.

..if you want to submit to CRAN. There are practically zero if you host on your own website. Of course developers are free to do whatever they want and R core does not get to tell them what/how to do it. R core does get a say when you ask them to host your source and build your package binaries.

> Quality has its costs.

So does using CRAN. If it is not the best solution for your problem, use something else. Hadley uses github for development of ggplot2, and with the devtools package, it is relatively easy for users to install the source ggplot2 code. Something like that might be appropriate for code/packages where you just want to 'turn ideas into software quickly'. There is an extra step required for users to use it, but that makes sense because it weeds out inept users from using code with less quality control.

> Many of the code checks I find quite useful, but the "no visible binding"
> one generates lots of nuisance notes for me. I must have a similar coding
> style to Mark.
>
> Kevin
>
>
> On Thu, Mar 29, 2012 at 8:29 PM, wrote:
>
>> I'm concerned this thread is heading the wrong way, towards techno-fixes
>> for imaginary problems. R package-building is already encumbered with a
>> huge set of complicated rules, and more instructions/rules eg for metadata
>> would make things worse not better.
>>
>> RCMD CHECK on the 'mvbutils' package generates over 300 Notes about "no
>> visible binding...", which inevitably I just ignore. They arise because
>> RCMD CHECK is too "stupid" to understand one of my preferred coding idioms
>> (I'm not going to explain what-- that's beside the point).
And RCMD CHECK >> always will be too "stupid" to understand everything that a rich language >> like R might quite reasonably cause experienced coders to do. >> >> It should not be CRAN's business how I write my code, or even whether my >> code does what it is supposed to. It might be CRAN's business to try to >> work out whether my code breaks CRAN's policies, eg by causing R to crash >> horribly-- that's presumably what Warnings are for (but see below). And >> maybe there could be circumstances where an automatic check might be >> "worried" enough to alert the CRANia and require manual explanation and >> emails etc from a developer, but even that seems doomed given the growing >> deluge of packages. >> >> RCMD CHECK currently functions both as a "sanitizer" for CRAN, and as a >> developer-tool. But the fact that the one programl does both things seems >> accidental to me, and I think this dual-use is muddying the discussion. >> There's a big distinction between (i) code-checks that developers >> themselves might or might not find useful-- which should be left to the >> developer, and will vary from person to person-- and (ii) code-checks that >> CRAN enforces for its own peace-of-mind. Maybe it's convenient to have both >> functions in the same place, and it'd be fine to use Notes for one and >> Warnings for the other, but the different purposes should surely be kept >> clear. >> >> Personally, in building over 10 packages (only 2 on CRAN), I haven't found >> RCMD CHECK to be of any use, except for the code-documentation and >> example-running bits. I know other people have different opinions, but >> that's the point: one-size-does-not-fit-all when it comes to coding tools. >> >> And wrto the Warnings themselves: I feel compelled to point out that it's >> logically impossible to fully check whether R code will do bad things. One >> has to wonder at what point adding new checks becomes futile or >> counterproductive. 
There must be over 2000 people who have written CRAN >> packages by now; every extra check and non-back-compatible additional >> requirement runs the risk of generating false-negatives and incurring many >> extra person-hours to "fix" non-problems. Plus someone needs to document >> and explain the check (adding to the rule mountain), plus there is the time >> spent in discussions like this..! >> >> Mark >> >> Mark Bravington >> CSIRO CMIS >> Marine Lab >> Hobart >> Australia >> __
Re: [Rd] CRAN policies
t does-- as could anyone with a bit of creativity. If I had inclination and time, I could do it. In no reasonable sense would it be "better" code, though. So RCMD CHECK is neither a necessary nor sufficient condition for virtue. Inspection of a language as rich as R will never be foolproof. The user simply has to take it on trust that a package does what it claims, or otherwise decide not to use it. How the package does it, is up to the author. My experience of other people's software is: peace-of-mind starts with helpful documentation, and also depends on whether I get a sense from the archives that the author might actually help if I run into something odd. Several well-known packages fail these tests, so I avoid them. Automated checks, beyond a certain limited point which they have probably reached, seem to me to be playing the wrong game. Bill D: [proposal for additional "documentation" mechanism] Thanks for going to the trouble of looking at my code; I certainly appreciate the effort, but your proposal is exactly what I am against! The issue is with the check, not with my code, and as above I do not see why I should need to add elaborate justifications. For some, this particular check (visible-binding) is apparently useful. For others, it's not. So why not just leave it as a Note that people can worry about or not if they want? It should not be of concern to those very busy CRANia people. Joshua W: [CRAN can set its own rules, and if a package doesn't easily fit them, maybe it should be put elsewhere.] Certainly CRAN/R-core (the distinction is shadowy to me) can, and frequently does, decide to do whatever it wants, including decisions about what to host. But it does not follow that every decision taken is axiomatically a Good Thing for R. More effort now goes into R development from people outside R core than inside it (>3000 packages). 
If a CRAN/Rcore decision entails a lot of work for others to amend code in ways that do not make the code work better, then it doesn't strike me as a good decision. Ditto if perfectly functional code is forced off CRAN, where it is (sort of) easy to find-- it becomes more difficult for the wider R community to get it, and of course it may not get *any* checks that way. NB I am not commenting here on individual aspects of RCMD CHECK etc-- this is a general point about mission creep, helps and hindrances, and balance of workload.

Mark

Mark Bravington
CSIRO CMIS
Marine Lab
Hobart
Australia

From: Joshua Wiley [jwiley.ps...@gmail.com]
Sent: 31 March 2012 06:03
To: Kevin Wright
Cc: Bravington, Mark (CMIS, Hobart); r-de...@stat.math.ethz.ch
Subject: Re: [Rd] CRAN policies

On Fri, Mar 30, 2012 at 11:41 AM, Kevin Wright wrote:
> I'll echo Mark's concerns. R _used_ to be a language for "turning ideas
> into software quickly". Now it is more like "prototyping ideas in software
> quickly", and then spend a substantial amount of time trying to follow
> administrative rules to package the code.

..if you want to submit to CRAN. There are practically zero if you host on your own website. Of course developers are free to do whatever they want and R core does not get to tell them what/how to do it. R core does get a say when you ask them to host your source and build your package binaries.

> Quality has its costs.

So does using CRAN. If it is not the best solution for your problem, use something else. Hadley uses github for development of ggplot2, and with the devtools package, it is relatively easy for users to install the source ggplot2 code. Something like that might be appropriate for code/packages where you just want to 'turn ideas into software quickly'. There is an extra step required for users to use it, but that makes sense because it weeds out inept users from using code with less quality control.
> > Many of the code checks I find quite useful, but the "no visible binding" > one generates lots of nuisance notes for me. I must have a similar coding > style to Mark. > > Kevin > > > On Thu, Mar 29, 2012 at 8:29 PM, wrote: > >> I'm concerned this thread is heading the wrong way, towards techno-fixes >> for imaginary problems. R package-building is already encumbered with a >> huge set of complicated rules, and more instructions/rules eg for metadata >> would make things worse not better. >> >> RCMD CHECK on the 'mvbutils' package generates over 300 Notes about "no >> visible binding...", which inevitably I just ignore. They arise because >> RCMD CHECK is too "stupid" to understand one of my preferred coding idioms >> (I'm not going to explain what-- that's beside the point). And RCMD CHECK >> always will be too "stupid" to understand ev
Re: [Rd] CRAN policies
Mark

I would like to clarify two specific points.

On 12-03-31 04:41 AM, mark.braving...@csiro.au wrote:
> ...
> Someone has subsequently decided that code should look a certain way, and has added a check that isn't in the language itself-- but they haven't thought of everything, and of course they never could.

There is a large overlap between people writing the checks and people writing the interpreter. Even though your code may have been working, if your understanding of the language definition is not consistent with that of the people writing the interpreter, there is no guarantee that it will continue to work, and in some cases the way in which it fails could be that it produces spurious results. I am inclined to think of code checks as an additional way to be sure my understanding of the R language is close to that of the people writing the interpreter.

> It depends on how Notes are being interpreted, which from this thread is no longer clear.
> The R-core line used to be "Notes are just notes" but now we seem to have "significant Notes" and ...

My understanding, and I think that of a few other people, was incorrect, in that I thought some notes were intended always to remain as notes, and others were more serious in that they would eventually become warnings or errors. I think Uwe addressed this misunderstanding by saying that all notes are intended to become warnings or errors. In several cases the reason they are not yet warnings or errors is that the checks are not yet good enough, they produce too many false positives.

So, this means that it is very important for us to look at the notes and to point out the reasons for the false positives, otherwise they may become warnings or errors without being recognised as such.

> ...

Paul
Re: [Rd] CRAN policies
On Sat, Mar 31, 2012 at 9:57 AM, Paul Gilbert wrote:
> Mark
>
> I would like to clarify two specific points.
>
> On 12-03-31 04:41 AM, mark.braving...@csiro.au wrote:
>> ...
>
>> Someone has subsequently decided that code should look a certain way, and
>> has added a check that
>> isn't in the language itself-- but they haven't thought of everything, and
>> of course they never could.
>
>
> There is a large overlap between people writing the checks and people
> writing the interpreter. Even though your code may have been working, if
> your understanding of the language definition is not consistent with that of
> the people writing the interpreter, there is no guarantee that it will
> continue to work, and in some cases the way in which it fails could be that
> it produces spurious results. I am inclined to think of code checks as an
> additional way to be sure my understanding of the R language is close to
> that of the people writing the interpreter.

The point is that it has been historically possible to push R in different directions even without the blessing of the core team, but if it's locked down too tightly then we lose that facility, and it's that loss or potential loss that is worrying. The idea of the package system is that it should be possible to extend R without having to modify the core of R itself.

>> It depends on how Notes are being interpreted, which from this thread is
>> no longer clear.
>
>> The R-core line used to be "Notes are just notes" but now we seem to have
>> "significant Notes" and ...
>
> My understanding, and I think that of a few other people, was incorrect, in

I don't think so. I think it was changed on us and I think it ought to be changed back.

Some people on this thread seem to be framing this as a quality issue but it's nothing of the sort. The specifics cited make it clear that the current handling of Notes is not improving the quality of any package but is potentially causing thousands of package developers needless work on packages that have been working for years. If the Notes are just there to be helpful that is one thing, but changing the idea of Notes so that an undefined subset of them are arbitrarily imposed at the whim of the R core group is what is objectionable.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [Rd] CRAN policies
> -Original Message- > From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] > On Behalf Of Paul Gilbert > Sent: March-31-12 9:57 AM > To: mark.braving...@csiro.au > Cc: r-de...@stat.math.ethz.ch > Subject: Re: [Rd] CRAN policies > Greetings all > Mark > > I would like to clarify two specific points. > > On 12-03-31 04:41 AM, mark.braving...@csiro.au wrote: > > ... > > Someone has subsequently decided that code should look a certain way, > > and has added a check that isn't in the language itself-- but they haven't > thought of everything, and of course they never could. > > There is a large overlap between people writing the checks and people writing > the interpreter. Even though your code may have been working, if your > understanding of the language definition is not consistent with that of the > people writing the interpreter, there is no guarantee that it will continue to > work, and in some cases the way in which it fails could be that it produces > spurious results. I am inclined to think of code checks as an additional way to be > sure my understanding of the R language is close to that of the people writing > the interpreter. > > > It depends on how Notes are being interpreted, which from this thread is no > longer clear. > > The R-core line used to be "Notes are just notes" but now we seem to have > "significant Notes" and ... > > My understanding, and I think that of a few other people, was incorrect, in that > I thought some notes were intended always to remain as notes, and others > were more serious in that they would eventually become warnings or errors. I > think Uwe addressed this misunderstanding by saying that all notes are > intended to become warnings or errors. In several cases the reason they are > not yet warnings or errors is that the checks are not yet good enough, they > produce too many false positives. 
> So, this means that it is very important for us to look at the notes and to point > out the reasons for the false positives, otherwise they may become warnings or > errors without being recognised as such. > I left the above intact as it nicely illustrates what much of this discussion reminds me of. Let me illustrate with the question of software development in one of my favourite languages: C++. The first issue to consider is, "What is the language definition and who decides?" Believe it or not, there are two answers from two very different perspectives. The first is favoured by language lawyers, who point to the ANSI standard, and who will argue incessantly about the finest of details. But to understand this, you have to understand what ANSI is: it is an industry organization and to construct the standard, they have industry representatives gathered, divided up into subcommittees each of which is charged with defining the language. And of course everyone knows that, being human, they can get it wrong, and thus ANSI standards evolve ever so slowly through time. To my mind, that is not much different from what R/core or Cran are involved in. But the other answer comes from the perspective of a professional software developer, and that is, that the final arbiter of what the language is is your compiler. If you want to get product out the door, it doesn't matter if the standard says 'X' if the compiler doesn't support it, or worse, implements it incorrectly. Most compilers have warnings and errors, and I like the idea of extending that to have notes, but that is a matter of taste vs pragmatism. I know many software developers that choose to ignore warnings and fix only the errors. Their rationale is that it takes time they don't have to fix the warnings too. 
And I know others who treat all warnings as errors unless they have discovered that there is a compiler bug that generates spurious warnings of a particular kind (in which case that specific warning can usually be turned off). Guess which group has lower bug rates on average. I tend to fall in the latter group, having observed that with many of these things, you either fix them now or you will fix them, at greater cost, later. The second issue to consider is, "What constitutes good code, and what is necessary to produce it?" That I won't answer beyond saying, 'whatever works.' That is because it is ultimately defined by the end users' requirements. that is why we have software engineers who specialize in requirements engineering. these are bright people who translate the wish lists of non-technical users into functional and environmental requirements, that the rest of us can code to. But before we begin coding, we have QA specialists that design a variety of tests from finely focussed unit tests through integration tests to broa
Re: [Rd] CRAN policies
Hi, Ted: Thank you for the most eloquent and complete description of the problem and opportunity I've seen in a while. Might you have time to review the Wikipedia articles on "Package development process" and "Software repository" (http://en.wikipedia.org/wiki/Package_development_process; http://en.wikipedia.org/wiki/Software_repository) and share with me your reactions? I wrote the "Package development process" article and part of the "Software repository" article, because the R package development process is superior to similar processes I've seen for other languages. However, I'm not a leading researcher on these issues, and your comments suggest that you know far more than I about this. Humanity might benefit from your review of these articles. (If you have any changes you might like to see, please make them or ask me to make them. Contributing to Wikipedia can be a very high leverage activity, as witnessed by the fact that the Wikipedia article on SOPA received a million views between the US holidays of Thanksgiving and Christmas last year.) Thanks again, Spencer On 3/31/2012 8:29 AM, Ted Byers wrote: -Original Message- From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Paul Gilbert Sent: March-31-12 9:57 AM To: mark.braving...@csiro.au Cc: r-de...@stat.math.ethz.ch Subject: Re: [Rd] CRAN policies Greetings all Mark I would like to clarify two specific points. On 12-03-31 04:41 AM, mark.braving...@csiro.au wrote: > ... Someone has subsequently decided that code should look a certain way, and has added a check that isn't in the language itself-- but they haven't thought of everything, and of course they never could. There is a large overlap between people writing the checks and people writing the interpreter. 
Even though your code may have been working, if your understanding of the language definition is not consistent with that of the people writing the interpreter, there is no guarantee that it will continue to work, and in some cases the way in which it fails could be that it produces spurious results. I am inclined to think of code checks as an additional way to be sure my understanding of the R language is close to that of the people writing the interpreter. It depends on how Notes are being interpreted, which from this thread is no longer clear. > The R-core line used to be "Notes are just notes" but now we seem to have "significant Notes" and ... My understanding, and I think that of a few other people, was incorrect, in that I thought some notes were intended always to remain as notes, and others were more serious in that they would eventually become warnings or errors. I think Uwe addressed this misunderstanding by saying that all notes are intended to become warnings or errors. In several cases the reason they are not yet warnings or errors is that the checks are not yet good enough, they produce too many false positives. So, this means that it is very important for us to look at the notes and to point out the reasons for the false positives, otherwise they may become warnings or errors without being recognised as such. I left the above intact as it nicely illustrates what much of this discussion reminds me of. Let me illustrate with the question of software development in one of my favourite languages: C++. The first issue to consider is, "What is the language definition and who decides?" Believe it or not, there are two answers from two very different perspectives. The first is favoured by language lawyers, who point to the ANSI standard, and who will argue incessantly about the finest of details. 
But to understand this, you have to understand what ANSI is: it is an industry organization and to construct the standard, they have industry representatives gathered, divided up into subcommittees each of which is charged with defining the language. And of course everyone knows that, being human, they can get it wrong, and thus ANSI standards evolve ever so slowly through time. To my mind, that is not much different from what R/core or Cran are involved in. But the other answer comes from the perspective of a professional software developer, and that is, that the final arbiter of what the language is is your compiler. If you want to get product out the door, it doesn't matter if the standard says 'X' if the compiler doesn't support it, or worse, implements it incorrectly. Most compilers have warnings and errors, and I like the idea of extending that to have notes, but that is a matter of taste vs pragmatism. I know many software developers that choose to ignore warnings and fix only the errors. Their rationale is that it takes time they don't have to fix the warnings too. And I know others who treat al
Re: [Rd] CRAN policies
> -Original Message- > From: Spencer Graves [mailto:spencer.gra...@prodsyse.com] > Sent: March-31-12 1:56 PM > To: Ted Byers > Cc: 'Paul Gilbert'; mark.braving...@csiro.au; r-de...@stat.math.ethz.ch > Subject: Re: [Rd] CRAN policies > > Hi, Ted: > > >Thank you for the most eloquent and complete description of the problem > and opportunity I've seen in a while. > To paraphrase and flagrantly plagiarize a better scholar than I, 'If I have seen farther, it is because I stand on the shoulders of giants.' No really, I have been doing this since the stone age, when we used rocks, or marks cut into sticks, or knots tied in string made from hemp, as our computing devices. And the extent to which most of us could count was '1,2,3, many' ;-) Might I suggest an additional essay for you about the place of documentation in quality software production? We all know the benefits of design documentation, but documentation intended for users is, in my view, critical. In my view, though, I have a successful interface if users find it so intuitive that they have no need for the wonderful documentation I write. I'll say no more but to give an example of the best documentation of a software product I have seen in more than 30 years (no, I wrote neither it nor the software it describes): http://eigen.tuxfamily.org/dox/index.html. It is so nice to be able to commend someone who has done well! Eigen is a C++ library supporting very efficient and fast matrix algebra, and then some. GSL is another very good example: http://www.gnu.org/software/gsl/manual/html_node/ but not quite as good, in my view, as Eigen There is a SCM product, primarily Unix, though it does build under Cygwin, called Aegis. The last I looked, it had a nice explanation of the protocol of testing, and ensuring that everything builds and passes all tests before adding new or revised code to the codebase. 
There may be support for it in more recent products like GIT or Subversion, but to be honest I haven't had the time to look. To gather material for requirements gathering, and use of that to guide QA processes and the design of one of the several suites of tests a project usually needs, the place where the best info is in the many references dealing with UML. You have made a good start on those pages, but it needs to be fleshed out. I do not recommend making either of them longer than 50% more than their current length. Rather, I suggest fleshing it out hypertext fashion, by adding (links to) pages dealing with different issues in more detail than is possible in an executive summary. But, overall, well done. Cheers Ted > >Might you have time to review the Wikipedia articles on "Package > development process" and "Software repository" > (http://en.wikipedia.org/wiki/Package_development_process; > http://en.wikipedia.org/wiki/Software_repository) and share with me your > reactions? > > >I wrote the "Package development process" article and part of the > "Software repository" article, because the R package development process > is superior to similar processes I've seen for other languages. > However, I'm not a leading researcher on these issues, and your comments > suggest that you know far more than I about this. Humanity might > benefit from your review of these articles. (If you have any changes > you might like to see, please make them or ask me to make them. > Contributing to Wikipedia can be a very high leverage activity, as > witnessed by the fact that the Wikipedia article on SOPA received a > million views between the US holidays of Thanksgiving and Christmas last > year.) 
>
> Thanks again,
> Spencer
>
> On 3/31/2012 8:29 AM, Ted Byers wrote:
> >> -Original Message-
> >> From: r-devel-boun...@r-project.org [mailto:r-devel-bounces@r-project.org]
> >> On Behalf Of Paul Gilbert
> >> Sent: March-31-12 9:57 AM
> >> To: mark.braving...@csiro.au
> >> Cc: r-de...@stat.math.ethz.ch
> >> Subject: Re: [Rd] CRAN policies
> >>
> > Greetings all
> >
> >> Mark
> >>
> >> I would like to clarify two specific points.
> >>
> >> On 12-03-31 04:41 AM, mark.braving...@csiro.au wrote:
> >> > ...
> >>> Someone has subsequently decided that code should look a certain way,
> >>> and has added a check that isn't in the language itself-- but they haven't
> >>> thought of everything, and of course they never could.
> >>
> >> There is a large overlap between people writing the checks and people writing
> >> the interpreter. Even though your code may hav
Re: [Rd] CRAN policies
>>>>> William Dunlap
>>>>> on Fri, 30 Mar 2012 16:07:52 + writes:

> It looks like you define a few functions that use substitute() or sys.call()
> or similar functions to look at the unevaluated argument list. E.g.,
> "cq" <-
> function( ...) {
> # Saves putting in quotes!
> # E.G.: quoted( first, second, third) is the same as c( 'first', 'second', 'third')
> # wrapping by as.character means cq() returns character(0) not list()
> as.character( sapply( as.list( match.call( expand.dots=TRUE))[-1], as.character))
> }
> %such.that% and %SUCH.THAT% do similar things.

> Almost all the complaints from check involve calls to a
> handful of such functions. If you could tell
> codetools:::checkUsage that these functions did
> nonstandard evaluation on all or some of their arguments
> then the complaints would go away and other checks for
> real errors like misspellings would still be done.

I agree very much with you, Bill.

Many (if not the majority) of my packages have given these false positive notes for many months now... and I have to admit that the effect indeed has been that I take notes much less seriously nowadays. This of course has never been the intention.

I'm pretty sure that most of us agree that it would be very useful, if not desirable, to have a simple and robust way for package authors to declare nonstandard evaluation to the checkUsage() checks.

Maybe we should branch a new thread about this, for proposals on how to go about this.

Martin

> Another possible part of the problem is that if checkUsage
> is checking a function like
> f <- function(x) paste(x, cq(suffix), sep=".")
> it attributes the out-of-scope suffix problem to 'f' and doesn't mention that the immediate
> caller is 'cq', so you cannot easily filter output complaints about cq. (CRAN would
> not do such filtering, but a developer might.)
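
[Illustration] The behaviour described above can be reproduced with a short sketch. It is not part of the original messages beyond the quoted cq() definition and the f() example. It assumes the recommended 'codetools' package is installed, and the collector variable 'notes' is a hypothetical name used only here. The sketch runs cq() directly, then hands a caller of cq() to codetools::checkUsage() and gathers whatever it reports, so that one can see which function the report is attributed to.

```r
library(codetools)  # assumption: the recommended 'codetools' package is available

# cq() as quoted in the thread: returns its unevaluated arguments as strings.
cq <- function(...) {
  as.character(sapply(as.list(match.call(expand.dots = TRUE))[-1], as.character))
}

# A caller of cq(), as in the example above; 'suffix' is never defined anywhere.
f <- function(x) paste(x, cq(suffix), sep = ".")

# Both functions work at run time, because 'suffix' is never evaluated.
print(cq(first, second, third))
print(f("file"))

# Collect checkUsage() messages in 'notes' (hypothetical name) instead of printing them.
notes <- character(0)
checkUsage(f, name = "f", report = function(msg) notes <<- c(notes, msg))
print(notes)
```

Any message collected in 'notes' is labelled with the name passed as 'name' (here "f"), which is why a developer cannot easily filter complaints by the nonstandard-evaluation function that actually caused them.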
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com

>> -Original Message-
>> From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf
>> Of mark.braving...@csiro.au
>> Sent: Thursday, March 29, 2012 6:30 PM
>> Cc: r-de...@stat.math.ethz.ch
>> Subject: Re: [Rd] CRAN policies
>>
>> I'm concerned this thread is heading the wrong way, towards techno-fixes for imaginary
>> problems. R package-building is already encumbered with a huge set of complicated
>> rules, and more instructions/rules eg for metadata would make things worse not better.
>>
>> RCMD CHECK on the 'mvbutils' package generates over 300 Notes about "no visible
>> binding...", which inevitably I just ignore. They arise because RCMD CHECK is too "stupid"
>> to understand one of my preferred coding idioms (I'm not going to explain what-- that's
>> beside the point). And RCMD CHECK always will be too "stupid" to understand everything
>> that a rich language like R might quite reasonably cause experienced coders to do.
>>
>> It should not be CRAN's business how I write my code, or even whether my code does
>> what it is supposed to. It might be CRAN's business to try to work out whether my code
>> breaks CRAN's policies, eg by causing R to crash horribly-- that's presumably what
>> Warnings are for (but see below). And maybe there could be circumstances where an
>> automatic check might be "worried" enough to alert the CRANia and require manual
>> explanation and emails etc from a developer, but even that seems doomed given the
>> growing deluge of packages.
>>
>> RCMD CHECK currently functions both as a "sanitizer" for CRAN, and as a developer-tool.
>> But the fact that the one program does both things seems accidental to me, and I think
>> this dual-use is muddying the discussion.
>> There's a big distinction between (i) code-checks
>> that developers themselves might or might not find useful-- which should be left to the
>> developer, and will vary from person to person-- and (ii) code-checks that CRAN enforces
>> for its own peace-of-mind. Maybe it's convenient to have both functions in the same
>> place, and it'd be fine to use Notes for one and Warnings for the other, but the different
>> purposes should surely be kept clear.
>>
>> Personally, in building over 10 packages (only 2 on CRAN