Re: [Rd] CRAN policies

2012-03-28 Thread jing hua zhao

 From: x...@yihui.name
 Date: Tue, 27 Mar 2012 16:40:04 -0500
 To: r-devel@r-project.org
 Subject: Re: [Rd] CRAN policies
 
 I have been wondering if it is possible to automate the checking
 process to reduce human efforts, e.g. automatically check the packages
 submitted to FTP, and send the package maintainer an email in case of
 warnings or errors (otherwise just move it to CRAN); package
 maintainers can appeal for a manual check by CRAN maintainers in case
 of false positives. As a package author, I really hate to bother CRAN
 maintainers each time I upload a new version and it passes R CMD check
 successfully, in which case I should have received an automatic email
 instead of Kurt's hand-written "thanks, on CRAN now". Frankly
 speaking, it makes me feel guilty sometimes to update my packages,
 thinking of the other 3700 packages on CRAN and how much time you CRAN
 maintainers are spending on checking the packages.
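
The triage proposed above could be sketched in R roughly as follows. This is only an illustration: `triage_check_log()` is a hypothetical helper, not part of R or of CRAN's tooling, and it assumes the check log uses the usual `* checking ... STATUS` line format.

```r
# Hypothetical helper: decide what to do with a submission from its
# R CMD check log, following the rule proposed above (warnings or errors
# trigger an email to the maintainer; anything else is published).
triage_check_log <- function(log_lines) {
  # Status lines end in "... OK", "... NOTE", "... WARNING" or "... ERROR".
  status <- sub("^\\* checking .* \\.\\.\\. ?([A-Z]+)$", "\\1", log_lines)
  status <- status[status != log_lines]  # keep only the lines that matched
  levels <- c("ERROR", "WARNING", "NOTE", "OK")
  counts <- vapply(levels, function(s) sum(status == s), integer(1))
  action <- if (counts[["ERROR"]] + counts[["WARNING"]] > 0) {
    "email maintainer"
  } else {
    "publish"
  }
  list(counts = counts, action = action)
}

# Made-up log excerpt for illustration.
example_log <- c(
  "* using R version 2.15.0",
  "* checking package dependencies ... OK",
  "* checking R code for possible problems ... NOTE"
)
triage_check_log(example_log)$action
```

Notes are deliberately not treated as blocking here, which is exactly the point the rest of the thread argues about.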
 


Indeed, it is a good summary of how I have felt for so long, and in particular
of my recent experience, which involved Kurt, Brian, and Uwe.



I think win-builder certainly helps, but is it feasible to have a Linux 
counterpart that has the final say?



 I do not know how many package authors actually read this mailing
 list, so these policies may not really reach some authors at all.
 
Certainly more colleagues read the list than the postings reveal.

Kind regards,





Jing Hua



 Regards,
 Yihui
 --
 Yihui Xie xieyi...@gmail.com
 Phone: 515-294-2465 Web: http://yihui.name
 Department of Statistics, Iowa State University
 2215 Snedecor Hall, Ames, IA
 
 __
 R-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel
  


Re: [Rd] CRAN policies

2012-03-28 Thread Uwe Ligges



On 28.03.2012 00:07, Hadley Wickham wrote:

On Tue, Mar 27, 2012 at 6:52 AM, Prof Brian Ripley
rip...@stats.ox.ac.uk  wrote:

CRAN has for some time had a policies page at
http://cran.r-project.org/web/packages/policies.html
and we would like to draw this to the attention of package maintainers.  In
particular, please


Thanks for the pointer - I did not know that this page existed. In
general, is there some easy way to track changes to this page and the
R extension manual over time?  It is difficult to keep track of the
best practices.

I'd also like to get clarification on "Packages should not write in
the users' home filespace, nor anywhere else on the file system apart
from the R session's temporary directory (or during installation in
the location pointed to by TMPDIR: and such usage should be cleaned
up)." What is the recommended practice for packages to maintain state
across instances?  Operating systems have standards for where
applications can store settings (e.g. as described in
http://pypi.python.org/pypi/appdirs/1.2.0).  Is it acceptable for
packages to follow these conventions?
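
For readers unfamiliar with the conventions referred to above, here is a rough sketch in R. `user_data_dir()` is a hypothetical helper, not a base R function, and it only approximates common per-OS locations rather than reproducing appdirs exactly. It computes a path and writes nothing; whether a CRAN package may write there is the open question.

```r
# Hypothetical helper: guess a per-user data directory from common OS
# conventions. `os` and `home` are arguments so the logic can be tried
# out without depending on the machine it runs on.
user_data_dir <- function(appname,
                          os = Sys.info()[["sysname"]],
                          home = path.expand("~")) {
  base <- switch(os,
    Windows = Sys.getenv("LOCALAPPDATA",
                         unset = file.path(home, "AppData", "Local")),
    Darwin  = file.path(home, "Library", "Application Support"),
    # Other Unix-alikes: XDG base directory convention.
    Sys.getenv("XDG_DATA_HOME", unset = file.path(home, ".local", "share"))
  )
  file.path(base, appname)
}

# "mypkg" and the home directory are placeholders.
user_data_dir("mypkg", os = "Darwin", home = "/Users/alice")
```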



The policy is meant to prevent packages from overwriting user data or 
generating loads of temporary files from examples and polluting, e.g., 
the working directory.
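
A minimal sketch of an example that stays within the policy: it writes only under the session's temporary directory and removes the file afterwards. The file name pattern and the data set are arbitrary choices for illustration.

```r
# Write example output under the session's temporary directory.
out <- tempfile(pattern = "example-", fileext = ".csv")
write.csv(head(datasets::cars), out, row.names = FALSE)

# Use the file, then remove it so nothing is left behind.
dat <- read.csv(out)
unlink(out)
```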


Uwe Ligges





Hadley





Re: [Rd] CRAN policies

2012-03-28 Thread Uwe Ligges



On 27.03.2012 20:33, Jeffrey Ryan wrote:

Thanks Uwe for the clarification on what goes and what stays.

Still fuzzy on the notion of significant though.  Do you have an example
or two for the list?



We have to look at those notes again and again in order to find if 
something important is noted, hence please always try to avoid all notes 
unless the effect is really intended!



Consider the Note "no visible binding for global variable".
We cannot know if your code intends to use such a global variable (which 
is undesirable in most cases), hence we would let it pass if it seems to be 
sensible.
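
To illustrate how this Note typically arises, here is a small sketch. The function names are made up for the example. The first version relies on non-standard evaluation, so a static check sees `speed` as a free variable; the second uses explicit column access.

```r
# `speed` is resolved inside `df` by subset() at run time, but a static
# code check only sees a free variable and may report it as a global.
fast_rows_note <- function(df) {
  subset(df, speed > 20)
}

# Same rows selected with explicit column access: no free variable.
fast_rows_clean <- function(df) {
  df[df[["speed"]] > 20, , drop = FALSE]
}

# codetools (a recommended package) lists the free variables it finds.
if (requireNamespace("codetools", quietly = TRUE)) {
  print(codetools::findGlobals(fast_rows_note, merge = FALSE)$variables)
  print(codetools::findGlobals(fast_rows_clean, merge = FALSE)$variables)
}
```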


Another Note, such as "empty section" or "partial argument match", can 
quickly be fixed, hence just do it and don't waste our time.
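
As an illustration of the second kind of Note, partial argument matching is usually a one-word fix. A minimal sketch:

```r
# Partial matching: `len` is silently expanded to `length.out`.
a <- seq(0, 1, len = 5)

# Spelled out in full: same result, and nothing for the check to report.
b <- seq(0, 1, length.out = 5)

stopifnot(identical(a, b))
```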


Best,
Uwe Ligges




Jeff

P.S.
I meant to also thank all of CRAN volunteers for the momentous efforts
involved, and it is nice to see some explanation of how we can help, as
well as a peek into what goes on 'behind the curtain' ;-)

On 3/27/12 1:19 PM, Uwe Ligges lig...@statistik.tu-dortmund.de wrote:




On 27.03.2012 19:10, Jeffrey Ryan wrote:

Is there a distinction as to NOTE vs. WARNING that is documented?  I've
always assumed (wrongly?) that NOTES weren't an issue with publishing on
CRAN, but that they may change to WARNINGS at some point.


We won't kick packages off CRAN for Notes (but we will if Warnings are
not fixed), but we may not accept new submissions with significant Notes.

Best,
Uwe Ligges




Is the process by which this happens documented somewhere?

Jeff

On 3/27/12 11:09 AM, Gabor Grothendieck ggrothendi...@gmail.com
wrote:


2012/3/27 Uwe Ligges lig...@statistik.tu-dortmund.de:



On 27.03.2012 17:09, Gabor Grothendieck wrote:


On Tue, Mar 27, 2012 at 7:52 AM, Prof Brian Ripley
rip...@stats.ox.ac.uk wrote:


CRAN has for some time had a policies page at
http://cran.r-project.org/web/packages/policies.html
and we would like to draw this to the attention of package maintainers.
In particular, please

- always send a submission email to c...@r-project.org with the package
name and version on the subject line.  Emails sent to individual members
of the team will result in delays at best.

- run R CMD check --as-cran on the tarball before you submit it.  Do
this with the latest version of R possible: definitely R 2.14.2,
preferably R 2.15.0 RC or a recent R-devel.  (Later versions of R are
able to give better diagnostics, e.g. for compiled code and especially
on Windows. They may also have extra checks for recently uncovered
problems.)
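
For maintainers who script their release steps, the check requested above can be launched from R itself. This is a sketch under assumptions: `as_cran_args()` and `run_as_cran()` are hypothetical wrappers, and the tarball name is a placeholder.

```r
# Hypothetical wrapper: argument vector for `R CMD check --as-cran`.
as_cran_args <- function(tarball) {
  c("CMD", "check", "--as-cran", tarball)
}

# Run the check with the R binary of the current installation.
run_as_cran <- function(tarball) {
  r_bin <- file.path(R.home("bin"), "R")
  system2(r_bin, as_cran_args(tarball))
}

# Not run here; "mypkg_1.0.tar.gz" is a placeholder file name.
# run_as_cran("mypkg_1.0.tar.gz")
```

Using `R.home("bin")` makes sure the check runs under the same R version as the calling session, which matters given the advice to use the latest R available.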

Also, please note that CRAN has a very heavy workload (186 packages were
published last week) and to remain viable needs package maintainers to
make its life as easy as possible.



Regarding the part about "warnings or significant notes" in that page,
it's impossible to know which notes are significant and which ones are
not significant except by trial and error.




Right, it needs human inspection to identify false positives. We believe
most package maintainers are able to see whether they are hit by such a
false positive.


The problem is that a note is generated and the note is correct. It's
not a false positive.  But that does not tell you whether it's
significant or not.  There is no way to know.  One can either try to
remove all notes (which may not be feasible) or just upload it and by
trial and error find out if it's accepted or not.

--
Statistics   Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [Rd] CRAN policies

2012-03-28 Thread Uwe Ligges



On 27.03.2012 20:36, Gabor Grothendieck wrote:

2012/3/27 Uwe Ligges lig...@statistik.tu-dortmund.de:



On 27.03.2012 19:10, Jeffrey Ryan wrote:


Is there a distinction as to NOTE vs. WARNING that is documented?  I've
always assumed (wrongly?) that NOTES weren't an issue with publishing on
CRAN, but that they may change to WARNINGS at some point.



We won't kick packages off CRAN for Notes (but we will if Warnings are not
fixed), but we may not accept new submissions with significant Notes.


Yes, I understand that, but that does not really address the problem
that one has no idea of whether a Note is significant or not, so the
only way to determine its significance is to submit your package and
see if it's accepted or not.



We have to look at those notes again and again in order to find if 
something important is noted, hence please always try to avoid all notes 
unless the effect is really intended!



Consider the Note "no visible binding for global variable".
We cannot know if your code intends to use such a global variable (which 
is undesirable in most cases), hence we would let it pass if it seems to be 
sensible.


Another Note, such as "empty section" or "partial argument match", can 
quickly be fixed, hence just do it and don't waste our time.


Best,
Uwe Ligges



Re: [Rd] CRAN policies

2012-03-28 Thread Gabor Grothendieck
2012/3/28 Uwe Ligges lig...@statistik.tu-dortmund.de:


 On 27.03.2012 20:33, Jeffrey Ryan wrote:

 Thanks Uwe for the clarification on what goes and what stays.

 Still fuzzy on the notion of significant though.  Do you have an example
 or two for the list?



 We have to look at those notes again and again in order to find if something
 important is noted, hence please always try to avoid all notes unless the
 effect is really intended!


 Consider the Note "no visible binding for global variable".
 We cannot know if your code intends to use such a global variable (which is
 undesirable in most cases), hence we would let it pass if it seems to be
 sensible.

 Another Note, such as "empty section" or "partial argument match", can
 quickly be fixed, hence just do it and don't waste our time.

 Best,
 Uwe Ligges

What is the point of notes vs warnings if you have to get rid of both
of them?  Furthermore, if there are notes that you don't have to get
rid of, it's not fair that package developers should have to waste their
time on things that are actually acceptable.  Finally, it makes the
whole system arbitrary since packages can be rejected based on
undefined rules.

Either divide notes into significant notes and ordinary notes and
clearly label them as such in the output of   R CMD check   or else
make the significant notes warnings so one can know in advance whether
the package passes R CMD check or not.

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [Rd] serialization regression in 2.15.0 beta

2012-03-28 Thread Ben Goodrich

Quoting Prof Brian Ripley rip...@stats.ox.ac.uk:


On 27/03/2012 22:01, Ben Goodrich wrote:

In case anyone is concerned that this regression will affect them, the
code was reverted to the 2.14.x behavior by


r58842 | ripley | 2012-03-26 08:12:43 -0400 (Mon, 26 Mar 2012) | 1 line
Changed paths:
   M /branches/R-2-15-branch/doc/NEWS.Rd
   M /branches/R-2-15-branch/src/library/parallel/R/unix/forkCluster.R
   M /branches/R-2-15-branch/src/library/parallel/R/unix/mcfork.R

revert to XDR serialization for 2.15.0



But the underlying problem (in non-xdr binary unserialization) is AFAWK
fixed: it was just that at this late stage there was too little time to
test thoroughly before release.

Please test R-devel on your own problem (we haven't: the issue was
found using a different example from elsewhere).


Indeed, the issue seems to be fixed in r-devel for my example.

Thanks,
Ben



I am experiencing a problem related to serialization behavior in
2.15.0 beta (binary installed from Debian unstable) and 2.16.0 (from
svn) that is not present in 2.14.2 (binary from Debian testing).

I don't fully understand the problem. Also, I tried but have not yet
been able to create a small, self-contained example that reproduces
the problem. However, I do have a large, not self-contained example,
which requires an alpha version (not yet on CRAN) of the mi package
(the mi package on CRAN would not exhibit this issue). Anyone
interested in reproducing the problem can follow the readme.txt file
in this directory:

http://www.columbia.edu/~bg2382/mi/serialization/

I track r-devel with git-svn and was able to git bisect to svn
commit r58219


commit 799102bd9d0266fe89c3120981decf0b1f17ef11
Author: ripley <ripley at 00db46b3-68df-0310-9c12-caf00c1e9a41>
Date:   Sat Jan 28 15:02:34 2012 +

 make use of non-xdr serialization;.

although this commit could merely expose the problem rather than cause it.

The problem occurs when the FUN called by mclapply() in the parallel
package returns a S4 object that contains a slot (called X) that is a
large matrix, specifically a model matrix similar to that produced
by glm(). Some columns of this matrix get corrupted with wrong values
(usually zero, but sometimes NaN or 10^300ish), which can be seen by
examining X right before FUN returns (to mclapply()'s environment) and
comparing to the same X after mclapply() returns to the calling
environment.
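
The before/after comparison described above can be approximated without forking by round-tripping a matrix through both binary serialization formats. This is only a sketch: the matrix is a made-up stand-in for a model matrix, and passing here does not reproduce or rule out the problem seen with mclapply().

```r
set.seed(1)
# Stand-in for a model matrix: an intercept column plus random columns.
X <- cbind(1, matrix(rnorm(2000), nrow = 200))

# Round-trip through the native (non-xdr) and the xdr binary formats.
X_native <- unserialize(serialize(X, NULL, xdr = FALSE))
X_xdr    <- unserialize(serialize(X, NULL, xdr = TRUE))

# Any corruption in transit would make these comparisons fail.
stopifnot(identical(X, X_native), identical(X, X_xdr))
```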

Part of svn commit r58219 is this hunk

diff --git a/src/library/parallel/R/unix/mcfork.R b/src/library/parallel/R/unix/mcfork.R
index 8e27534..4f92193 100644
--- a/src/library/parallel/R/unix/mcfork.R
+++ b/src/library/parallel/R/unix/mcfork.R
@@ -82,7 +82,8 @@ mckill <- function(process, signal = 2L)
 ## used by mcparallel, mclapply
 sendMaster <- function(what)
 {
-    if (!is.raw(what)) what <- serialize(what, NULL, FALSE)
+    # This is talking to the same machine, so no point in using xdr.
+    if (!is.raw(what)) what <- serialize(what, NULL, xdr = FALSE)
     .Call(C_mc_send_master, what, PACKAGE = "parallel")
 }

Contrary to the comment, I have found that if I specify xdr = TRUE, I
get the expected (non-corrupted X slot) behavior in 2.16.0, even
though it is forking locally on my 64bit Debian laptop with a little
endian i7 processor, whose specs are

goodrich at CYBERPOWERPC:/tmp/serialization$ cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 42
model name  : Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz
stepping: 7
microcode   : 0x17
cpu MHz : 800.000
cache size  : 6144 KB
physical id : 0
siblings: 8
core id : 0
cpu cores   : 4
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl
vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt
tsc_deadline_timer xsave avx lahf_lm ida arat epb xsaveopt pln pts dts
tpr_shadow vnmi flexpriority ept vpid
bogomips: 3990.83
clflush size: 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

...

processor   : 7
[same as processor 0]

So, to summarize I get the good behavior on R 2.14.2 when using
mclapply(), on 2.15.0 beta when using lapply(), and on 2.16.0 using
mclapply() iff I patch in xdr = TRUE in sendMaster(). I get the bad
behavior on 2.15.0 beta and unpatched 2.16.0 when using mclapply().

My session info:


sessionInfo()

R version 2.15.0 beta (2012-03-16 r58769)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8  

[Rd] --as-cran / BuildVignettes: false

2012-03-28 Thread Paul Gilbert


I have packages where I know CRAN and other test platforms do not have 
all the resources to build the vignettes, for example, access to 
databases. Previously I think putting


 BuildVignettes: false

in the DESCRIPTION file resolved this, by preventing CRAN checks from 
attempting to run the vignette code. (If it was not this, then there was 
some other magic I don't understand.)


Now, when I specify --as-cran, the checks fail when attempting to check 
R code from vignettes, even though I have BuildVignettes: false in the 
DESCRIPTION file.


What is the mechanism for indicating that CRAN should not attempt to 
check this code?  Perhaps it is intentionally difficult - I can see an 
argument for that.  (For running tests there are environment variables, 
e.g. _R_CHECK_HAVE_MYSQL_, but using these really clutters up a vignette, 
and it did not seem necessary to use them before.)
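
For context, a guard of the kind mentioned above might look like the sketch below. The variable name is the one from the message; `have_mysql()` is a hypothetical helper, and treating any non-empty value as "available" is an assumption.

```r
# Hypothetical guard for vignette or test code that needs a database.
have_mysql <- function() {
  nzchar(Sys.getenv("_R_CHECK_HAVE_MYSQL_"))
}

if (have_mysql()) {
  message("database available: running the full example")
  # ... code that connects to the database would go here ...
} else {
  message("database not available: skipping")
}
```

Every database-dependent chunk needs such a wrapper, which is the clutter being objected to.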


(The difficulty also occurs on R-forge, possibly because it is using 
--as-cran like settings.)


Paul



Re: [Rd] --as-cran / BuildVignettes: false

2012-03-28 Thread Uwe Ligges



On 28.03.2012 18:07, Paul Gilbert wrote:


I have packages where I know CRAN and other test platforms do not have
all the resources to build the vignettes, for example, access to
databases. Previously I think putting

BuildVignettes: false

in the DESCRIPTION file resolved this, by preventing CRAN checks from
attempting to run the vignette code. (If it was not this, then there was
some other magic I don't understand.)

Now, when I specify --as-cran, the checks fail when attempting to check
R code from vignettes, even though I have BuildVignettes: false in the
DESCRIPTION file.


Paul, it says BuildVignettes rather than CheckVignettes.
If you want CRAN to disable those checks for some very good reason, 
please tell the CRAN maintainers; they will move your package to the 
exclude list for vignette checking.


Best,
Uwe




What is the mechanism for indicating that CRAN should not attempt to
check this code? Perhaps it is intentionally difficult - I can see an
argument for that. (For running tests there are environment variables,
e.g. _R_CHECK_HAVE_MYSQL_, but using these really clutters up a vignette,
and it did not seem necessary to use them before.)

(The difficulty also occurs on R-forge, possibly because it is using
--as-cran like settings.)

Paul



Re: [Rd] CRAN policies

2012-03-28 Thread Uwe Ligges



On 28.03.2012 16:30, Gabor Grothendieck wrote:

2012/3/28 Uwe Ligges lig...@statistik.tu-dortmund.de:



On 27.03.2012 20:33, Jeffrey Ryan wrote:


Thanks Uwe for the clarification on what goes and what stays.

Still fuzzy on the notion of significant though.  Do you have an example
or two for the list?




We have to look at those notes again and again in order to find if something
important is noted, hence please always try to avoid all notes unless the
effect is really intended!


Consider the Note "no visible binding for global variable".
We cannot know if your code intends to use such a global variable (which is
undesirable in most cases), hence we would let it pass if it seems to be
sensible.

Another Note, such as "empty section" or "partial argument match", can
quickly be fixed, hence just do it and don't waste our time.

Best,
Uwe Ligges


What is the point of notes vs warnings if you have to get rid of both
of them?  Furthermore, if there are notes that you don't have to get
rid of, it's not fair that package developers should have to waste their
time on things that are actually acceptable.  Finally, it makes the
whole system arbitrary since packages can be rejected based on
undefined rules.

Either divide notes into significant notes and ordinary notes and
clearly label them as such in the output of   R CMD check   or else
make the significant notes warnings so one can know in advance whether
the package passes R CMD check or not.




I tried to make clear that we cannot decide that automatically and that it 
needs human inspection and thinking to tell if some Note is significant or 
not. That is why we have not made them Warnings, which we use where we are 
sure things have to be fixed.


Please always try to avoid all notes unless the effect is really 
intended! How hard can it be? If Notes could be completely ignored, they 
would not be Notes.


Uwe



Re: [Rd] CRAN policies

2012-03-28 Thread Thomas Lumley
On Thu, Mar 29, 2012 at 3:30 AM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 2012/3/28 Uwe Ligges lig...@statistik.tu-dortmund.de:


 On 27.03.2012 20:33, Jeffrey Ryan wrote:

 Thanks Uwe for the clarification on what goes and what stays.

 Still fuzzy on the notion of significant though.  Do you have an example
 or two for the list?



 We have to look at those notes again and again in order to find if something
 important is noted, hence please always try to avoid all notes unless the
 effect is really intended!


 Consider the Note "no visible binding for global variable".
 We cannot know if your code intends to use such a global variable (which is
 undesirable in most cases), hence we would let it pass if it seems to be
 sensible.

 Another Note, such as "empty section" or "partial argument match", can
 quickly be fixed, hence just do it and don't waste our time.

 Best,
 Uwe Ligges

 What is the point of notes vs warnings if you have to get rid of both
 of them?  Furthermore, if there are notes that you don't have to get
 rid of, it's not fair that package developers should have to waste their
 time on things that are actually acceptable.  Finally, it makes the
 whole system arbitrary since packages can be rejected based on
 undefined rules.


The notes are precisely the things for which clear rules can't be
written.  They are reported by CMD check because they are usually
signs of coding errors, but are not warnings because their use is
sometimes justified.

The 'No visible binding for global variable' Note is a good example.  This
found some bugs in my 'survey' package, which I removed. There is
still one note of this type, which arises when I have to handle two
different versions of the hexbin package with different internal
structures.  The note is a false positive because the use is guarded
by an if(), but  CMD check can't tell this.   So, it's a good idea to
remove all Notes that can be removed without introducing other code
problems, which is nearly all of them, but occasionally there may be a
good reason for code that produces a Note.

But if you want a simple, unambiguous, mechanical rule for *your*
packages, just eliminate all Notes.

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland
