Re: [Rd] Problem with table

2012-03-27 Thread Prof Brian Ripley

On 19/03/2012 17:01, Terry Therneau wrote:

R version 2.14.0, started with --vanilla

> table(c(1,2,3,4,NA), exclude=2, useNA='ifany')
   1    3    4 <NA>
   1    1    1    2

This came from a local user who wanted to remove one particular response
from some tables, but also wants to have NA always reported for data
checking purposes.
I don't think the above is what anyone would want.


You have not told us what you want!

Try

> table(as.factor(c(1,2,3,4,NA)), exclude=2, useNA='ifany')

   1    3    4 <NA>
   1    1    1    1

Note carefully how 'exclude' is defined:

 exclude: levels to remove from all factors in ‘...’. If set to ‘NULL’,
  it implies ‘useNA=always’.

As you did not specify a factor, 'exclude' was used in forming the 'levels'.



PS.
This is on a background of our local desires, which is to have the
default action of the table command be to report NA, if present. (It's
one of the only commands that we globally override at Mayo.) The user had
added only the exclude=2 argument, and the useNA value is our default.

The above makes this harder to do without rewriting the command
wholesale, which is ok (we've done it before at various times in R and
Splus) but we would avoid it if possible. Please no wars about whether
this is the right decision or not; we've done it for 10+ years and quite
firmly believe the extra robustness gained by having NA appear is worth
the maintenance bother, correctness being paramount in medical research.
We're not trying to convert anyone else, just get feedback on the best
way to approach this.


Most likely, feed table() a factor with the properties you want.
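
One way to read that suggestion, as a minimal sketch only: drop the unwanted response before building the factor, then let useNA report any missing values. The names x, keep and f are illustrative and do not come from the thread, and this is not necessarily the approach adopted at Mayo.

```r
x <- c(1, 2, 3, 4, NA)

# Keep everything except the response 2. NA entries are kept explicitly,
# because `x != 2` evaluates to NA (not TRUE) for missing values.
keep <- is.na(x) | x != 2

# Build the factor from the retained values, so that 2 is never a level.
f <- factor(x[keep])

# NA is not a level of f, so useNA = "ifany" adds an NA count only when
# missing values are actually present.
tab <- table(f, useNA = "ifany")
print(tab)
```

Filtering the data first keeps the handling of the excluded value separate from the handling of NA, which is the interaction that produced the surprising count above.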



Terry T.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595



[Rd] drawing the graph with many nodes

2012-03-27 Thread MSousa
Good morning, 


I'm trying to draw a graph, and I'm using the following code.

test.matrix <- read.table("~/Desktop/Results/testgephi.csv", header = TRUE,
                          sep = ",")
colnames(test.matrix) <- gsub("X", "", colnames(test.matrix))
# drop first column
drops <- c("")
test.matrix <- test.matrix[, !(names(test.matrix) %in% drops)]
test.matrix
test.matrix <- data.matrix(test.matrix)
am.graph <- new("graphAM", adjMat = test.matrix, edgemode = "directed")
am.graph
plot(am.graph, attrs = list(node = list(fillcolor = "lightblue"),
                            edge = list(arrowsize = 0.5)))


The file testgephi.csv is as follows:
,1,2,3,4,5
1,393,55,66,44,88
2,44,23,47,57,89
3,57,87,98,456,43
4,77,767,86,32,77
5,43,88,23,76,46

With this example the drawing works well. The problem is when I try to
draw the graph from a file that gives:

A graphAM graph with directed edges
Number of Nodes = 217
Number of Edges = 32804

Is there any package or tool that can draw a structure like this?


   Thanks




Re: [Rd] drawing the graph with many nodes

2012-03-27 Thread Whit Armstrong
if you don't mind going outside of R to create it, then check out
Graphviz: http://www.graphviz.org/Gallery.php

you may have to reformat your data a little, but this tool is great
for drawing graphs.

-Whit
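
As a rough illustration of the reformatting step mentioned above, the sketch below turns an adjacency matrix into Graphviz DOT text using base R only. The helper name adj_to_dot is hypothetical, and it assumes that a non-zero entry in row i, column j means a directed edge from node i to node j; adjust if your matrix uses another convention.

```r
# Convert a square adjacency matrix into Graphviz DOT text (directed graph).
# Assumption: m[i, j] != 0 means there is an edge from node i to node j.
adj_to_dot <- function(m) {
  nodes <- colnames(m)
  # One row per edge, with columns "row" (from) and "col" (to).
  idx <- which(m != 0, arr.ind = TRUE)
  edges <- sprintf('  "%s" -> "%s" [label="%s"];',
                   nodes[idx[, "row"]], nodes[idx[, "col"]], m[idx])
  c("digraph G {", edges, "}")
}

# Small illustrative matrix (not the poster's data).
m <- matrix(c(0, 1, 0,
              0, 0, 1,
              1, 0, 0), nrow = 3, byrow = TRUE,
            dimnames = list(c("1", "2", "3"), c("1", "2", "3")))
dot <- adj_to_dot(m)
cat(dot, sep = "\n")

# To render outside R, write the text to a file and run Graphviz on it,
# for example:
#   writeLines(dot, "graph.dot")
#   dot -Tpdf graph.dot -o graph.pdf
```

For a graph with hundreds of nodes and tens of thousands of edges the layout engine and output format matter more than the conversion, so treat this only as the data-export half of the job.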




[Rd] CRAN policies

2012-03-27 Thread Prof Brian Ripley

CRAN has for some time had a policies page at
http://cran.r-project.org/web/packages/policies.html
and we would like to draw this to the attention of package maintainers. 
 In particular, please


- always send a submission email to c...@r-project.org with the package
name and version on the subject line.  Emails sent to individual members 
of the team will result in delays at best.


- run R CMD check --as-cran on the tarball before you submit it.  Do
this with the latest version of R possible: definitely R 2.14.2,
preferably R 2.15.0 RC or a recent R-devel.  (Later versions of R are
able to give better diagnostics, e.g. for compiled code and especially
on Windows. They may also have extra checks for recently uncovered
problems.)

Also, please note that CRAN has a very heavy workload (186 packages were 
published last week) and to remain viable needs package maintainers to 
make its life as easy as possible.


Kurt Hornik
Uwe Ligges
Brian Ripley



Re: [Rd] Typo in ?Logistic

2012-03-27 Thread Martin Maechler
 R Michael Weylandt michael.weyla...@gmail.com
 on Mon, 26 Mar 2012 14:29:31 -0400 writes:

 In the source section of ?rlogis, we see:
 Source:

 ‘[dpr]logis’ are calculated directly from the definitions.

 ‘rlogis’ uses inversion.

 Should that read [dpq]logis instead?

yes, indeed; now fixed.  Thank you very much!
Martin



Re: [Rd] drawing the graph with many nodes

2012-03-27 Thread oliver
Hi,

your code does not run in a fresh R environment:

  Error in getClass(Class, where = topenv(parent.frame())) :
    "graphAM" is not a defined class

If you don't provide working code, it's (too) much effort to help.

There are some graph packages around.
Which you need depends on what you want to do.
I can't decide that easily, without seeing your example running.


Ciao,
   Oliver



Re: [Rd] Problem with table

2012-03-27 Thread Terry Therneau

On 03/27/2012 02:05 AM, Prof Brian Ripley wrote:

On 19/03/2012 17:01, Terry Therneau wrote:

R version 2.14.0, started with --vanilla

> table(c(1,2,3,4,NA), exclude=2, useNA='ifany')
   1    3    4 <NA>
   1    1    1    2

This came from a local user who wanted to remove one particular response
from some tables, but also wants to have NA always reported for data
checking purposes.
I don't think the above is what anyone would want.


You have not told us what you want!

Want: that the resulting table exclude values of 2 from the printout,
while still reporting NA.  This is what the local user expected, the one
who came to me with their query.


There are lots of ways to get the program to do the right thing, the 
simplest is

 table(c(1,2,3,4,NA), exclude=2) # keeping the default for useNA

You show another below.



Try

> table(as.factor(c(1,2,3,4,NA)), exclude=2, useNA='ifany')

   1    3    4 <NA>
   1    1    1    1

Note carefully how 'exclude' is defined:

 exclude: levels to remove from all factors in ‘...’. If set to ‘NULL’,
  it implies ‘useNA=always’.

As you did not specify a factor, 'exclude' was used in forming the 
'levels'.


That is almost a legal loophole reading of the manual.  I would never 
have seen through to that level of subtlety.  A primary reason is that a 
simple test shows that exclude works on non-factors.


I'm not sure what the best course of action is.  What I've reported is a 
case where use of the options in a fairly obvious way gives an 
unexpected answer.  On the other hand, I have never  before seen or 
considered the case where someone wanted to exclude an actual data level 
from table: I myself would always have removed a column from the 
result.   If fixing this causes other problems, then perhaps we just 
give up on this rare case.


As to our local choices, we figured out a way to make display of NA the 
default without causing the above problem.   As is often the case, a 
fairly simple solution became obvious to us about 30 minutes after 
submitting a question to the list.


Terry T.



[Rd] PROTECT help

2012-03-27 Thread Terry Therneau
I received the following note this AM.  The problem is, I'm not quite 
sure how to fix it.
Can one use PROTECT(coxlist(eval(PROTECT , do I create an 
intermediate variable, or otherwise?


I'm willing to update the code if someone will give me a pointer to the
right documentation.  This particular chunk was written when there was a
lot of change going on in the callback mechanism and so there might be a
safer and/or simpler and/or more standard approach by now. The routine in
question has to do with penalized Cox models: the C code needs to get
the value of the penalty, and the penalty is an arbitrary S expression
passed down from top level.


Terry T



In survival_2.36-12 (and earlier), in the function cox_callback() at
cox_Rcallback.c:40:

PROTECT(coxlist=eval(lang2(fexpr,data),rho));

the return value of the call to lang2() is vulnerable if allocations
within eval() give rise to garbage collection.

(Discovered during CXXR development.)

Andrew



Re: [Rd] PROTECT help

2012-03-27 Thread Prof Brian Ripley

On 27/03/2012 14:22, Terry Therneau wrote:

I received the following note this AM. The problem is, I'm not quite
sure how to fix it.
Can one use PROTECT(coxlist(eval(PROTECT , do I create an
intermediate variable, or otherwise?


You can, but I find it easiest to follow if you create an intermediate 
variable.  Look for example at unique.c:


SEXP call, r;
PROTECT(call = lang2(install("as.character"), s));
PROTECT(r = eval(call, env));
UNPROTECT(2);
return r;







Re: [Rd] PROTECT help

2012-03-27 Thread Duncan Murdoch

On 12-03-27 9:22 AM, Terry Therneau wrote:

I received the following note this AM.  The problem is, I'm not quite
sure how to fix it.
Can one use PROTECT(coxlist(eval(PROTECT , do I create an
intermediate variable, or otherwise?


I think both would work.  The usual style in R sources is to use an 
intermediate variable, assigned within the PROTECT call, e.g.


PROTECT(var = f());

but this would act the same as

var = PROTECT(f());

I don't know where the best docs are, but here is my understanding of 
PROTECT:


What PROTECT(x) does is to make a copy of the pointer x in a stack of 
protected pointers.  When garbage collection happens, nothing in that 
stack will be released.  It is safe to protect things that don't need 
protection, but it is a little inefficient.  (You shouldn't call 
PROTECT() on a pointer that isn't an R object declared as a SEXP, but it 
will only cause trouble in certain debugging modes.)


PROTECT(x) does return the value of x, so f(PROTECT(x)) should evaluate 
the same as f(x) (but x will be protected from collection).


The main thing to watch when you use PROTECT is that you keep track of 
how many times it is called, because UNPROTECT just pops a number of 
pointers off the protection stack.







Re: [Rd] PROTECT help

2012-03-27 Thread Terry Therneau


Brian & Duncan:
  Thanks.  This was exactly what I needed to know.

Terry



Re: [Rd] CRAN policies

2012-03-27 Thread Paul Gilbert
One of the things I have noticed with the R 2.15.0 RC and --as-cran is
that I have to bump the version number of the working copy of my
packages immediately after putting a version on CRAN, or I get a
message about version suitability. This is probably a good thing for
packages that I have changed, compared with my old habit of bumping the 
version number at arbitrary times, although the mechanics are a nuisance 
because I do not actually want to commit to the next version number at 
that point. For packages that I have not changed it is a bit worse, 
because I have to change the version number even though I have not yet 
made any changes to the package. This will mean, for example, that on 
R-forge it will look like there is a slightly newer version, even though 
there is not really.


I am curious how other developers approach this. Is it better to not 
specify --as-cran most of the time?  My feeling is that it is better to 
specify it all of the time so that I catch errors sooner rather than 
later, but maybe there is a better solution?


Paul



Re: [Rd] CRAN policies

2012-03-27 Thread Uwe Ligges



On 27.03.2012 16:17, Paul Gilbert wrote:

One of the things I have noticed with the R 2.15.0 RC and --as-cran is
that I have to bump the version number of the working copy of my
packages immediately after putting a version on CRAN, or I get a
message about version suitability. This is probably a good thing for
packages that I have changed, compared with my old habit of bumping the
version number at arbitrary times, although the mechanics are a nuisance
because I do not actually want to commit to the next version number at
that point. For packages that I have not changed it is a bit worse,
because I have to change the version number even though I have not yet
made any changes to the package. This will mean, for example, that on
R-forge it will look like there is a slightly newer version, even though
there is not really.

I am curious how other developers approach this. Is it better to not
specify --as-cran most of the time? My feeling is that it is better to
specify it all of the time so that I catch errors sooner rather than
later, but maybe there is a better solution?



--as-cran is modelled rather closely after the CRAN incoming checks. 
CRAN checks if a new version has a new version number. Of course, you 
can ignore its result if you do not want to submit. The idea of using 
--as-cran is to apply it before you actually submit. Some parts require 
network connection etc.
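
The version comparison can be previewed locally before running the full check. The sketch below uses hypothetical version strings and assumes that the working copy needs a version strictly greater than the one already on CRAN; it is an illustration of the comparison, not the code that the incoming check runs.

```r
# Hypothetical version strings, for illustration only.
cran_version <- package_version("1.4-2")  # version already on CRAN
work_version <- package_version("1.4-2")  # version in the working copy

# Assumption: a submission needs a version strictly greater than CRAN's.
needs_bump <- work_version <= cran_version
print(needs_bump)

# After bumping the last component the comparison passes.
bumped <- package_version("1.4-3")
ok_after_bump <- bumped > cran_version
print(ok_after_bump)
```

Comparing package_version objects rather than raw strings avoids lexical surprises such as "1.10" sorting before "1.9".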


Uwe







Re: [Rd] CRAN policies

2012-03-27 Thread Gabor Grothendieck


Regarding the part about warnings or significant notes in the CRAN
policies page, it's impossible to know which notes are significant and
which ones are not significant except by trial and error.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [Rd] CRAN policies

2012-03-27 Thread Paul Gilbert



On 12-03-27 10:59 AM, Uwe Ligges wrote:





--as-cran is modelled rather closely after the CRAN incoming checks.
CRAN checks if a new version has a new version number. Of course, you
can ignore its result if you do not want to submit. The idea of using
--as-cran is to apply it before you actually submit. Some parts require
network connection etc.

Uwe


Yes but, for example, will R-forge run checks with --as-cran, and thus 
give warnings for any package unchanged from the one on CRAN, or run 
without --as-cran, and thus not give a true indication of whether the 
package is good to submit?


(No doubt R-forge will customise more, but I am trying to work out a 
strategy for my own automatic testing.)


Paul








Re: [Rd] CRAN policies

2012-03-27 Thread Uwe Ligges



On 27.03.2012 17:22, Paul Gilbert wrote:




Yes but, for example, will R-forge run checks with --as-cran, and thus
give warnings for any package unchanged from the one on CRAN, or run
without --as-cran, and thus not give a true indication of whether the
package is good to submit?



This is a question for the R-forge maintainer. I would not expect it
to run checks with --as-cran, but I do not know.


Best,
Uwe






Re: [Rd] CRAN policies

2012-03-27 Thread Uwe Ligges



On 27.03.2012 17:09, Gabor Grothendieck wrote:




Regarding the part about warnings or significant notes in the CRAN
policies page, it's impossible to know which notes are significant and
which ones are not significant except by trial and error.



Right, it needs human inspection to identify false positives. We believe
most package maintainers are able to see whether they have been hit by
such a false positive.


Uwe Ligges



Re: [Rd] CRAN policies

2012-03-27 Thread Prof Brian Ripley

On 27/03/2012 15:17, Paul Gilbert wrote:

One of the things I have noticed with the R 2.15.0 RC and --as-cran is
that I have to bump the version number of the working copy of my
packages immediately after putting a version on CRAN, or I get a
message about version suitability. This is probably a good thing for
packages that I have changed, compared with my old habit of bumping the
version number at arbitrary times, although the mechanics are a nuisance
because I do not actually want to commit to the next version number at
that point. For packages that I have not changed it is a bit worse,
because I have to change the version number even though I have not yet
made any changes to the package. This will mean, for example, that on
R-forge it will look like there is a slightly newer version, even though
there is not really.

I am curious how other developers approach this. Is it better to not
specify --as-cran most of the time? My feeling is that it is better to
specify it all of the time so that I catch errors sooner rather than
later, but maybe there is a better solution?


Yes.  It is only recommended for use just before submission.  It is not 
used by the CRAN daily checks, for example.


All it does is set some environment variables that you can also set in
~/.R/check.Renviron, scripts ... and that is what the CRAN team do.  We
introduced --as-cran to make it easier to explain to submitters how to
get the check results we reported [*].


As for what the set is, read 'R Internals' or the code (it will vary by 
R version).


Given that we get several submissions per week with the same version 
number or name as a package already on CRAN, we do need submitters to 
run the 'incoming' check before submission.
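
The version clash described above can be caught locally with a simple comparison. What follows is only a sketch: needs_bump() is a hypothetical helper and the version strings are made up; it is not the code the 'incoming' check itself runs.

```r
# Hypothetical helper: TRUE if the working copy needs a new version number
# before submission, i.e. it is not strictly newer than the published one.
needs_bump <- function(local_version, cran_version) {
  package_version(local_version) <= package_version(cran_version)
}

needs_bump("1.2-3", "1.2-3")  # same version as already published
needs_bump("1.2-4", "1.2-3")  # strictly newer
needs_bump("1.10", "1.9")     # compared component by component, not as text
```

package_version() is used rather than a string comparison so that 1.10 is treated as later than 1.9.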


[*] Since answering several emails a day about why their results were 
different was taking up far too much time.




Paul

On 12-03-27 07:52 AM, Prof Brian Ripley wrote:

CRAN has for some time had a policies page at
http://cran.r-project.org/web/packages/policies.html
and we would like to draw this to the attention of package maintainers.
In particular, please

- always send a submission email to c...@r-project.org with the package
name and version on the subject line. Emails sent to individual members
of the team will result in delays at best.

- run R CMD check --as-cran on the tarball before you submit it. Do
this with the latest version of R possible: definitely R 2.14.2,
preferably R 2.15.0 RC or a recent R-devel. (Later versions of R are
able to give better diagnostics, e.g. for compiled code and especially
on Windows. They may also have extra checks for recently uncovered
problems.)

Also, please note that CRAN has a very heavy workload (186 packages were
published last week) and to remain viable needs package maintainers to
make its life as easy as possible.

Kurt Hornik
Uwe Ligges
Brian Ripley




--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] CRAN policies

2012-03-27 Thread Gabor Grothendieck
2012/3/27 Uwe Ligges lig...@statistik.tu-dortmund.de:


 On 27.03.2012 17:09, Gabor Grothendieck wrote:

 On Tue, Mar 27, 2012 at 7:52 AM, Prof Brian Ripley
 rip...@stats.ox.ac.uk  wrote:

 CRAN has for some time had a policies page at
 http://cran.r-project.org/web/packages/policies.html
 and we would like to draw this to the attention of package maintainers.
  In
 particular, please

 - always send a submission email to c...@r-project.org with the package
 name and version on the subject line.  Emails sent to individual members
 of
 the team will result in delays at best.

 - run R CMD check --as-cran on the tarball before you submit it.  Do
 this with the latest version of R possible: definitely R 2.14.2,
 preferably R 2.15.0 RC or a recent R-devel.  (Later versions of R are
 able to give better diagnostics, e.g. for compiled code and especially
 on Windows. They may also have extra checks for recently uncovered
 problems.)

 Also, please note that CRAN has a very heavy workload (186 packages were
 published last week) and to remain viable needs package maintainers to
 make
 its life as easy as possible.


 Regarding the part about warnings or significant notes in that page,
 it's impossible to know which notes are significant and which ones are
 not significant except by trial and error.



 Right, it needs human inspection to identify false positives. We believe
 most package maintainers are able to see whether they are hit by such a false
 positive.

The problem is that a note is generated and the note is correct. It's
not a false positive.  But that does not tell you whether it's
significant or not.  There is no way to know.  One can either try to
remove all notes (which may not be feasible) or just upload it and by
trial and error find out if it's accepted or not.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [Rd] .Call ref card

2012-03-27 Thread Simon Urbanek

On Mar 27, 2012, at 12:03 PM, Terry Therneau wrote:

 On 03/23/2012 10:58 AM, Simon Urbanek wrote:
 This is my shot at a cheat sheet.
 comments are welcome.
 
 Simon
 
 
 I was looking through the cheat sheet.  It's nice.  There are a few things in 
 it that I can't find in the documentation though.  Where would one find a 
 description?  (I can guess, but that may be dangerous).
 
 mkNamed

It is a shorthand for using allocVector and then setting names (which can be 
tedious). It's a simple way to create a result list/object (a very common thing 
to do):

SEXP res = PROTECT(mkNamed(VECSXP, (const char*[]) { "foo", "bar", "" }));
// fill res with SET_VECTOR_ELT(res, ..) 
setAttrib(res, R_ClassSymbol, mkString("myClass"));
UNPROTECT(1);
return res;

Note that the sentinel is "" (not NULL as commonly used in other APIs). 
Also you don't specify the length because it is determined from the names.
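
For readers more at home at the R level, the object the C snippet returns corresponds roughly to the following. This is only an illustration of the resulting structure, not what mkNamed does internally; foo, bar and myClass are the placeholder names from the snippet.

```r
res <- vector("list", 2)       # like allocVector(VECSXP, 2): elements start as NULL
names(res) <- c("foo", "bar")  # the names given to mkNamed(); the length follows from them
class(res) <- "myClass"        # like setAttrib(res, R_ClassSymbol, mkString("myClass"))
str(unclass(res))
```

The elements stay NULL until they are filled in, which is what the SET_VECTOR_ELT calls do on the C side.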


 R_Naint   (I don't see quite how this differs from using NA_INTEGER to set a 
 result)

It doesn't really -- NA_INTEGER is defined to be R_NaInt. In theory NA_INTEGER 
being a macro could be a constant instead -- maybe for efficiency -- but 
currently it's not.


 R_PreserveObject, R_ReleaseObject   (Advantages/disadvantages wrt PRESERVE?)
 

I guess you mean wrt PROTECT? Preserve/Release is used for objects that you 
want to be globally preserved - i.e. they will survive exit from the function. 
In contrast, the protection stack is popped when you exit the function (both by 
error or success).

Cheers,
Simon



Re: [Rd] .Call ref card

2012-03-27 Thread Simon Urbanek
FWIW: I have put the (slightly updated) sheet at

http://r.research.att.com/man/R-API-cheat-sheet.pdf

Note that it is certainly incomplete - but that is intentional a) to fit the 
space constraints and b) to show only the most basic things since we are 
talking about starting with .Call -- advanced users may need a different sheet 
but then they just go straight to the headers anyway ...

Cheers,
Simon






Re: [Rd] Missing Windows binary for R-2.15RC?

2012-03-27 Thread Dan Tenenbaum
On Sat, Mar 24, 2012 at 1:07 PM, Duncan Murdoch
murdoch.dun...@gmail.com wrote:
 On 12-03-24 2:31 PM, Simon Urbanek wrote:


 On Mar 24, 2012, at 12:43 PM, Duncan Murdoch wrote:

 On 12-03-24 10:53 AM, Uwe Ligges wrote:



 On 24.03.2012 06:58, Daniel Nordlund wrote:

 -Original Message-
 From: Dan Tenenbaum [mailto:dtene...@fhcrc.org]
 Sent: Friday, March 23, 2012 5:48 PM
 To: Daniel Nordlund
 Cc: r-devel@r-project.org
 Subject: Re: [Rd] Missing Windows binary for R-2.15RC?

 On Fri, Mar 23, 2012 at 4:52 PM, Daniel Nordlund
 djnordl...@frontier.com    wrote:

 -Original Message-
 From: r-devel-boun...@r-project.org [mailto:r-devel-bounces@r-project.org]

 On Behalf Of Dan Tenenbaum
 Sent: Friday, March 23, 2012 12:21 PM
 To: r-devel@r-project.org
 Subject: [Rd] Missing Windows binary for R-2.15RC?

 Hi,

 The page
 http://cran.r-project.org/bin/windows/base/rtest.html
 has a link to:
 http://cran.r-project.org/bin/windows/base/R-2.15.0rc-win.exe

 However, clicking on that link gives a '404 Object not found' error.

 FYI.
 Dan


 I experienced the same error you did using the link you provided.

   However, if you use the CRAN mirror hosted by YOUR organization, you
 can
 get the file. :-)



 I don't think so:

 http://cran.fhcrc.org/bin/windows/base/R-2.15.0rc-win.exe

 gives me a 404 as well.

 Dan



 I didn't look closely enough at what you were asking for (RC versus
 beta).  R-2.15RC may not have been up-loaded yet.  However, I just
 downloaded it from the original link that was posted, so it appears to be
 available now.


 It may have happened that the scripts generated the webpages before the
 binary was built and checked (since beta became rc yesterday).


 Yes, they need manual tweaking at the conversion, and I did it after the
 first upload.

 If this happens again (which is pretty likely), you can manually download
 the previous version by editing the URL to put in alpha in place of
 beta, or beta in place of rc.
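
The manual URL edit suggested above could be scripted along these lines. This is a sketch only: previous_stage_url() is a hypothetical helper, and the file-name pattern is an assumption based on the rc link quoted in this thread.

```r
# Hypothetical helper: step a Windows pre-release installer URL back one
# stage (rc -> beta -> alpha). Returns NA if there is no earlier stage.
previous_stage_url <- function(url) {
  if (grepl("rc-win.exe", url, fixed = TRUE)) {
    sub("rc-win.exe", "beta-win.exe", url, fixed = TRUE)
  } else if (grepl("beta-win.exe", url, fixed = TRUE)) {
    sub("beta-win.exe", "alpha-win.exe", url, fixed = TRUE)
  } else {
    NA_character_
  }
}

rc <- "http://cran.r-project.org/bin/windows/base/R-2.15.0rc-win.exe"
previous_stage_url(rc)
```

Whether the earlier file is still on the mirror at that point is not something the helper can know; it only rewrites the name.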


 ... or have a fixed name instead (on OS X we just use 2.15-branch which is
 unambiguous). For the record I find it extremely annoying that even the
 installation target name changes in the installer - I keep having to change
 it to R-2.15 all the time, because I don't see why you would want to have
 alpha/beta/rc/release of the same R version installed in separate
 directories by default  - but that may be just me ;). To a lesser degree the
 same applies to patch versions, but since those are released I could see an
 argument for that, even though in practice I think it is not useful either
 (because typically you just want to upgrade and not another copy).


 I'm neutral about the name changes, but I don't think any of this is enough
 of a problem to be worth the time to fix.  If someone else wants to do it,
 then I'd be happy to let you take over.


Thanks all of you for looking into this. Bioconductor usually needs
the binaries as soon as they are available so if there is a
sustainable way to solve this, we'd appreciate it very much.

Dan



 Duncan Murdoch



Re: [Rd] CRAN policies

2012-03-27 Thread Gabor Grothendieck
2012/3/27 Uwe Ligges lig...@statistik.tu-dortmund.de:


 On 27.03.2012 19:10, Jeffrey Ryan wrote:

 Is there a distinction as to NOTE vs. WARNING that is documented?  I've
 always assumed (wrongly?) that NOTES weren't an issue with publishing on
 CRAN, but that they may change to WARNINGS at some point.


 We won't kick packages off CRAN for Notes (but we will if Warnings are not
 fixed), but we may not accept new submissions with significant Notes.

Yes, I understand that but that does not really address the problem
that one has no idea of whether a Note is significant or not so the
only way to determine its significance is to submit your package and
see if it's accepted or not.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [Rd] CRAN policies

2012-03-27 Thread Paul Gilbert
An associated problem, for the wish list, is that it would be nice for 
package developers to have a way to automatically distinguish between 
NOTEs that can usually be ignored (e.g. a package suggests a package 
that is not available for cross reference checks - I have several cases 
where the suggested package depends on the package being built, so this 
NOTE occurs all the time), and NOTEs that are really pre-WARNINGS, so 
that one can flag these and spend time fixing them before they become a 
WARNING or ERROR. Perhaps two different kinds of notes?


(And, BTW, having been responsible for a certain amount of the
  "[*] Since answering several emails a day about why their
  results were different was taking up far too much time."
I think --as-cran is great.)

Paul

On 12-03-27 02:19 PM, Uwe Ligges wrote:



On 27.03.2012 19:10, Jeffrey Ryan wrote:

Is there a distinction as to NOTE vs. WARNING that is documented? I've
always assumed (wrongly?) that NOTES weren't an issue with publishing on
CRAN, but that they may change to WARNINGS at some point.


We won't kick packages off CRAN for Notes (but we will if Warnings are
not fixed), but we may not accept new submissions with significant Notes.

Best,
Uwe Ligges




Is the process by which this happens documented somewhere?

Jeff



Re: [Rd] serialization regression in 2.15.0 beta

2012-03-27 Thread Ben Goodrich
In case anyone is concerned that this regression will affect them, the
code was reverted to the 2.14.x behavior by


r58842 | ripley | 2012-03-26 08:12:43 -0400 (Mon, 26 Mar 2012) | 1 line
Changed paths:
   M /branches/R-2-15-branch/doc/NEWS.Rd
   M /branches/R-2-15-branch/src/library/parallel/R/unix/forkCluster.R
   M /branches/R-2-15-branch/src/library/parallel/R/unix/mcfork.R

revert to XDR serialization for 2.15.0


Thanks,
Ben


 I am experiencing a problem related to serialization behavior in  
 2.15.0 beta (binary installed from Debian unstable) and 2.16.0 (from  
 svn) that is not present in 2.14.2 (binary from Debian testing).
 
 I don't fully understand the problem. Also, I tried but have not yet  
 been able to create a small, self-contained example that reproduces  
 the problem. However, I do have a large, not self-contained example,  
 which requires an alpha version (not yet on CRAN) of the mi package  
 (the mi package on CRAN would not exhibit this issue). Anyone  
 interested in reproducing the problem can follow the readme.txt file  
 in this directory:
 
 http://www.columbia.edu/~bg2382/mi/serialization/
 
 I track r-devel with git-svn and was able to git bisect to svn commit r58219
 
 commit 799102bd9d0266fe89c3120981decf0b1f17ef11
 Author: ripley ripley at 00db46b3-68df-0310-9c12-caf00c1e9a41
 Date:   Sat Jan 28 15:02:34 2012 +
 
  make use of non-xdr serialization;.
 
 although this commit could merely expose the problem rather than cause it.
 
 The problem occurs when the FUN called by mclapply() in the parallel  
 package returns a S4 object that contains a slot (called X) that is a  
 large matrix, specifically a model matrix similar to that produced  
 by glm(). Some columns of this matrix get corrupted with wrong values  
 (usually zero, but sometimes NaN or 10^300ish), which can be seen by  
 examining X right before FUN returns (to mclapply()'s environment) and  
 comparing to the same X after mclapply() returns to the calling  
 environment.
 
 Part of svn commit r58219 is this hunk
 
 diff --git a/src/library/parallel/R/unix/mcfork.R  
 b/src/library/parallel/R/unix/mcfork.R
 index 8e27534..4f92193 100644
 --- a/src/library/parallel/R/unix/mcfork.R
 +++ b/src/library/parallel/R/unix/mcfork.R
 @@ -82,7 +82,8 @@ mckill <- function(process, signal = 2L)
   ## used by mcparallel, mclapply
   sendMaster <- function(what)
   {
 -if (!is.raw(what)) what <- serialize(what, NULL, FALSE)
 +# This is talking to the same machine, so no point in using xdr.
 +if (!is.raw(what)) what <- serialize(what, NULL, xdr = FALSE)
   .Call(C_mc_send_master, what, PACKAGE = "parallel")
   }
 
 Contrary to the comment, I have found that if I specify xdr = TRUE, I  
 get the expected (non-corrupted X slot) behavior in 2.16.0, even  
 though it is forking locally on my 64bit Debian laptop with a little  
 endian i7 processor, whose specs are
 
 goodrich at CYBERPOWERPC:/tmp/serialization$ cat /proc/cpuinfo
 processor   : 0
 vendor_id   : GenuineIntel
 cpu family  : 6
 model   : 42
 model name  : Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz
 stepping: 7
 microcode   : 0x17
 cpu MHz : 800.000
 cache size  : 6144 KB
 physical id : 0
 siblings: 8
 core id : 0
 cpu cores   : 4
 apicid  : 0
 initial apicid  : 0
 fpu : yes
 fpu_exception   : yes
 cpuid level : 13
 wp  : yes
 flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge  
 mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe  
 syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl  
 xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl  
 vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt  
 tsc_deadline_timer xsave avx lahf_lm ida arat epb xsaveopt pln pts dts  
 tpr_shadow vnmi flexpriority ept vpid
 bogomips: 3990.83
 clflush size: 64
 cache_alignment : 64
 address sizes   : 36 bits physical, 48 bits virtual
 power management:
 
 ...
 
 processor   : 7
 [same as processor 0]
 
 So, to summarize I get the good behavior on R 2.14.2 when using  
 mclapply(), on 2.15.0 beta when using lapply(), and on 2.16.0 using  
 mclapply() iff I patch in xdr = TRUE in sendMaster(). I get the bad  
 behavior on 2.15.0 beta and unpatched 2.16.0 when using mclapply().
 
 My session info:
 
 sessionInfo()
 R version 2.15.0 beta (2012-03-16 r58769)
 Platform: x86_64-pc-linux-gnu (64-bit)
 
 locale:
   [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
   [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
   [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8
   [7] LC_PAPER=C LC_NAME=C
   [9] LC_ADDRESS=C   LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
 
 attached 

Re: [Rd] CRAN policies

2012-03-27 Thread Hadley Wickham
 I have been wondering if it is possible to automate the checking
 process to reduce human efforts, e.g. automatically check the packages
 submitted to FTP, and send the package maintainer an email in case of
 warnings or errors (otherwise just move it to CRAN); package
 maintainers can appeal for a manual check by CRAN maintainers in case
 of false positives.

I've started using win-builder before submitting to CRAN.  This often
picks up problems that I don't see locally.

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



Re: [Rd] CRAN policies

2012-03-27 Thread Hadley Wickham
On Tue, Mar 27, 2012 at 6:52 AM, Prof Brian Ripley
rip...@stats.ox.ac.uk wrote:
 CRAN has for some time had a policies page at
 http://cran.r-project.org/web/packages/policies.html
 and we would like to draw this to the attention of package maintainers.  In
 particular, please

Thanks for the pointer - I did not know that this page existed. In
general, is there some easy way to track changes to this page and the
R extension manual over time?  It is difficult to keep track of the
best practices.

I'd also like to get clarification on "Packages should not write in
the users' home filespace, nor anywhere else on the file system apart
from the R session's temporary directory (or during installation in
the location pointed to by TMPDIR: and such usage should be cleaned
up)." - what is recommended practice for packages to maintain state
across instances?  Operating systems have standards for where
applications can store settings (e.g. as described in
http://pypi.python.org/pypi/appdirs/1.2.0).  Is it acceptable for
packages to follow these conventions?
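
For reference, the one location the quoted policy text does allow is the session temporary directory, which does not persist across sessions (hence the question). A minimal sketch of that permitted pattern, with "mypkg" as a placeholder package name:

```r
# Write state only under the R session's temporary directory, then clean up.
state_file <- file.path(tempdir(), "mypkg-state.rds")  # "mypkg" is a placeholder
saveRDS(list(last_run = Sys.time()), state_file)
state <- readRDS(state_file)
unlink(state_file)  # "such usage should be cleaned up"
```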

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



Re: [Rd] CRAN policies

2012-03-27 Thread Murray Stokely
Lots of very sensible policies here.  I have one request as someone
who has in several cases had to involve company lawyers over
intellectual property issues with packages on CRAN -- the first bullet
point on ownership of copyright and intellectual property rights could
be strengthened further.

To the existing text "The ownership of copyright and intellectual
property rights of all components of the package must be clear and
unambiguous (including from the authors specification in the
DESCRIPTION file). Where code is copied (or derived) from the work of
others (including from R itself), care must be taken that any
copyright statements are preserved and authorship is not
misrepresented.
Trademarks must be respected."

I would add a few additional points:

1. The text of the license itself should be included in the package in
a LICENSE or COPYING file, as most of these licenses have things that
need to be filled in with names and other data, and just referencing a
license name in the DESCRIPTION file is not really a great way to deal
with licensing metadata when used exclusively (it's a great complement
to a full, filled-out license in the package itself).

2. Per-file copyright comment headers can help immensely with ensuring
compliance and preventing the accidental incorporation of files under a
different license.  Comment header blocks with the author name and terms
of distribution could be recommended for all source files.

   - Murray



Re: [Rd] serialization regression in 2.15.0 beta

2012-03-27 Thread Prof Brian Ripley

On 27/03/2012 22:01, Ben Goodrich wrote:

In case anyone is concerned that this regression will affect them, the
code was reverted to the 2.14.x behavior by


r58842 | ripley | 2012-03-26 08:12:43 -0400 (Mon, 26 Mar 2012) | 1 line
Changed paths:
M /branches/R-2-15-branch/doc/NEWS.Rd
M /branches/R-2-15-branch/src/library/parallel/R/unix/forkCluster.R
M /branches/R-2-15-branch/src/library/parallel/R/unix/mcfork.R

revert to XDR serialization for 2.15.0



But the underlying problem (in non-xdr binary unserialization) is AFAWK 
fixed: it was just that at this late stage there was too little time to 
test thoroughly before release.


Please test R-devel on your own problem (we haven't: the issue was found 
using a different example from elsewhere).
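
One way to run such a test is a round-trip comparison like the one below. This is a sketch only and does not reproduce the reported problem, which involved an S4 object from the unreleased mi package; roundtrip_ok() is a hypothetical helper and X is just a stand-in for a large model matrix.

```r
library(parallel)

# Hypothetical helper: TRUE if x survives serialize()/unserialize() unchanged.
roundtrip_ok <- function(x, xdr) {
  identical(unserialize(serialize(x, NULL, xdr = xdr)), x)
}

set.seed(1)
X <- matrix(rnorm(2e5), ncol = 100)  # stand-in for a large model matrix

roundtrip_ok(X, xdr = TRUE)   # XDR format
roundtrip_ok(X, xdr = FALSE)  # non-XDR (native byte order) format

# Forked workers are only available on Unix-alikes.
if (.Platform$OS.type == "unix") {
  res <- mclapply(1:2, function(i) X, mc.cores = 2)
  print(all(vapply(res, identical, logical(1), X)))
}
```

Replacing X with the object returned by your own FUN is the actual test; a FALSE anywhere would point at the kind of corruption described in the report below.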



Thanks,
Ben



I am experiencing a problem related to serialization behavior in
2.15.0 beta (binary installed from Debian unstable) and 2.16.0 (from
svn) that is not present in 2.14.2 (binary from Debian testing).

I don't fully understand the problem. Also, I tried but have not yet
been able to create a small, self-contained example that reproduces
the problem. However, I do have a large, not self-contained example,
which requires an alpha version (not yet on CRAN) of the mi package
(the mi package on CRAN would not exhibit this issue). Anyone
interested in reproducing the problem can follow the readme.txt file
in this directory:

http://www.columbia.edu/~bg2382/mi/serialization/

I track r-devel with git-svn and was able to git bisect to svn commit r58219

commit 799102bd9d0266fe89c3120981decf0b1f17ef11
Author: ripleyripley at 00db46b3-68df-0310-9c12-caf00c1e9a41
Date:   Sat Jan 28 15:02:34 2012 +

  make use of non-xdr serialization;.

although this commit could merely expose the problem rather than cause it.

The problem occurs when the FUN called by mclapply() in the parallel
package returns a S4 object that contains a slot (called X) that is a
large matrix, specifically a model matrix similar to that produced
by glm(). Some columns of this matrix get corrupted with wrong values
(usually zero, but sometimes NaN or 10^300ish), which can be seen by
examining X right before FUN returns (to mclapply()'s environment) and
comparing to the same X after mclapply() returns to the calling
environment.

Part of svn commit r58219 is this hunk

diff --git a/src/library/parallel/R/unix/mcfork.R
b/src/library/parallel/R/unix/mcfork.R
index 8e27534..4f92193 100644
--- a/src/library/parallel/R/unix/mcfork.R
+++ b/src/library/parallel/R/unix/mcfork.R
@@ -82,7 +82,8 @@ mckill <- function(process, signal = 2L)
   ## used by mcparallel, mclapply
   sendMaster <- function(what)
   {
-if (!is.raw(what)) what <- serialize(what, NULL, FALSE)
+# This is talking to the same machine, so no point in using xdr.
+if (!is.raw(what)) what <- serialize(what, NULL, xdr = FALSE)
   .Call(C_mc_send_master, what, PACKAGE = "parallel")
   }

Contrary to the comment, I have found that if I specify xdr = TRUE, I
get the expected (non-corrupted X slot) behavior in 2.16.0, even
though it is forking locally on my 64bit Debian laptop with a little
endian i7 processor, whose specs are

goodrich at CYBERPOWERPC:/tmp/serialization$ cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 42
model name  : Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz
stepping: 7
microcode   : 0x17
cpu MHz : 800.000
cache size  : 6144 KB
physical id : 0
siblings: 8
core id : 0
cpu cores   : 4
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl
vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt
tsc_deadline_timer xsave avx lahf_lm ida arat epb xsaveopt pln pts dts
tpr_shadow vnmi flexpriority ept vpid
bogomips: 3990.83
clflush size: 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

...

processor   : 7
[same as processor 0]

So, to summarize I get the good behavior on R 2.14.2 when using
mclapply(), on 2.15.0 beta when using lapply(), and on 2.16.0 using
mclapply() iff I patch in xdr = TRUE in sendMaster(). I get the bad
behavior on 2.15.0 beta and unpatched 2.16.0 when using mclapply().

My session info:


sessionInfo()

R version 2.15.0 beta (2012-03-16 r58769)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
   [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
   [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
   [5]