Re: [Rd] Problem with table
On 19/03/2012 17:01, Terry Therneau wrote:
> R version 2.14.0, started with --vanilla
>
>   table(c(1,2,3,4,NA), exclude=2, useNA='ifany')
>
>      1    3    4 <NA>
>      1    1    1    2
>
> This came from a local user who wanted to remove one particular response from some tables, but also wants to have NA always reported for data checking purposes. I don't think the above is what anyone would want.

You have not told us what you want! Try

  table(as.factor(c(1,2,3,4,NA)), exclude=2, useNA='ifany')

     1    3    4 <NA>
     1    1    1    1

Note carefully how 'exclude' is defined:

  exclude: levels to remove from all factors in ‘...’. If set to ‘NULL’, it implies ‘useNA="always"’.

As you did not specify a factor, 'exclude' was used in forming the 'levels'.

> PS. This is on a background of our local desires, which is to have the default action of the table command be to report NA, if present. (It's one of the only commands that we globally override at Mayo.) The user had added only the exclude=2 argument, and the useNA value is our default. The above makes this harder to do without rewriting the command wholesale, which is ok (we've done it before at various times in R and Splus) but we would avoid it if possible. Please no wars about whether this is the right decision or not, we've done it for 10+ years and quite firmly believe the extra robustness gained by having NA appear is worth the maintenance bother, correctness being paramount in medical research. We're not trying to convert anyone else, just get feedback on the best way to approach this.

Most likely, feed table() a factor with the properties you want.

> Terry T.

-- 
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] drawing the graph with many nodes
Good morning,

I'm trying to draw a graph, and I'm using the following code.

  test.matrix <- read.table("~/Desktop/Results/testgephi.csv", header = TRUE, sep = ",")
  colnames(test.matrix) <- gsub("X", "", colnames(test.matrix))
  # drop first column
  drops <- c("")
  test.matrix <- test.matrix[, !(names(test.matrix) %in% drops)]
  test.matrix
  test.matrix <- data.matrix(test.matrix)
  am.graph <- new("graphAM", adjMat = test.matrix, edgemode = "directed")
  am.graph
  plot(am.graph, attrs = list(node = list(fillcolor = "lightblue"), edge = list(arrowsize = 0.5)))

The file testgephi.csv is as follows.

  ,1,2,3,4,5
  1,393,55,66,44,88
  2,44,23,47,57,89
  3,57,87,98,456,43
  4,77,767,86,32,77
  5,43,88,23,76,46

With this example the drawing works well. The problem arises when I try to draw a graph from a file with

  A graphAM graph with directed edges
  Number of Nodes = 217
  Number of Edges = 32804

Is there any package or tool that can draw a structure like this?

Thanks
Re: [Rd] drawing the graph with many nodes
If you don't mind going outside of R to create it, then check out Graphviz:

http://www.graphviz.org/Gallery.php

You may have to reformat your data a little, but this tool is great for drawing graphs.

-Whit

On Tue, Mar 27, 2012 at 5:14 AM, MSousa ricardosousa2...@clix.pt wrote:
> A graphAM graph with directed edges
> Number of Nodes = 217
> Number of Edges = 32804
> is there any package or tool that can draw a structure like this
[Rd] CRAN policies
CRAN has for some time had a policies page at http://cran.r-project.org/web/packages/policies.html and we would like to draw this to the attention of package maintainers. In particular, please

- always send a submission email to c...@r-project.org with the package name and version on the subject line. Emails sent to individual members of the team will result in delays at best.

- run R CMD check --as-cran on the tarball before you submit it. Do this with the latest version of R possible: definitely R 2.14.2, preferably R 2.15.0 RC or a recent R-devel. (Later versions of R are able to give better diagnostics, e.g. for compiled code and especially on Windows. They may also have extra checks for recently uncovered problems.)

Also, please note that CRAN has a very heavy workload (186 packages were published last week) and to remain viable needs package maintainers to make its life as easy as possible.

Kurt Hornik
Uwe Ligges
Brian Ripley
Re: [Rd] Typo in ?Logistic
R Michael Weylandt michael.weyla...@gmail.com on Mon, 26 Mar 2012 14:29:31 -0400 writes:

> In the source section of ?rlogis, we see:
>
>   Source: ‘[dpr]logis’ are calculated directly from the definitions. ‘rlogis’ uses inversion.
>
> Should that read [dpq]logis instead?

Yes, indeed; now fixed. Thank you very much!

Martin
Re: [Rd] drawing the graph with many nodes
Hi,

your code does not run in a fresh R environment:

  Error in getClass(Class, where = topenv(parent.frame())) :
    "graphAM" is not a defined class

If you don't provide working code, it's (too) much effort to help.

There are some graph packages around. Which one you need depends on what you want to do. I can't decide that easily without seeing your example running.

Ciao,
Oliver
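The error Oliver reports is what one gets when the class definition has not been loaded. The graphAM class comes from the Bioconductor package 'graph', and a plot() method that accepts an 'attrs' argument is supplied by 'Rgraphviz'; the posted code loads neither. A self-contained version of the small example might look like the sketch below. It assumes both packages are installed, and it enters the 5 x 5 data inline instead of reading the poster's CSV file, so the object 'nodes' and the matrix() call are illustrative rather than part of the original post.

```r
# Assumption: the Bioconductor packages 'graph' and 'Rgraphviz' are installed.
library(graph)      # defines the graphAM class
library(Rgraphviz)  # supplies the plot() method used below

# The 5 x 5 example data from the original post, entered inline so that
# no external file is needed. Row and column names must agree.
nodes <- as.character(1:5)
test.matrix <- matrix(
  c(393,  55, 66,  44, 88,
     44,  23, 47,  57, 89,
     57,  87, 98, 456, 43,
     77, 767, 86,  32, 77,
     43,  88, 23,  76, 46),
  nrow = 5, byrow = TRUE, dimnames = list(nodes, nodes))

# Non-zero entries of the adjacency matrix are treated as edges.
am.graph <- new("graphAM", adjMat = test.matrix, edgemode = "directed")
am.graph

plot(am.graph,
     attrs = list(node = list(fillcolor = "lightblue"),
                  edge = list(arrowsize = 0.5)))
```

With the library() calls in place, anyone on the list can paste the example into a fresh session, which is what Oliver asks for.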
Re: [Rd] Problem with table
On 03/27/2012 02:05 AM, Prof Brian Ripley wrote:
> On 19/03/2012 17:01, Terry Therneau wrote:
> > R version 2.14.0, started with --vanilla
> >
> >   table(c(1,2,3,4,NA), exclude=2, useNA='ifany')
> >
> >      1    3    4 <NA>
> >      1    1    1    2
> >
> > This came from a local user who wanted to remove one particular response from some tables, but also wants to have NA always reported for data checking purposes. I don't think the above is what anyone would want.
>
> You have not told us what you want!

Want: that the resulting table exclude values of 2 from the printout, while still reporting NA. This is what the local user expected, the one who came to me with their query. There are lots of ways to get the program to do the right thing; the simplest is

  table(c(1,2,3,4,NA), exclude=2) # keeping the default for useNA

You show another below.

> Try
>
>   table(as.factor(c(1,2,3,4,NA)), exclude=2, useNA='ifany')
>
>      1    3    4 <NA>
>      1    1    1    1
>
> Note carefully how 'exclude' is defined:
>
>   exclude: levels to remove from all factors in ‘...’. If set to ‘NULL’, it implies ‘useNA="always"’.
>
> As you did not specify a factor, 'exclude' was used in forming the 'levels'.

That is almost a legal loophole reading of the manual. I would never have seen through to that level of subtlety. A primary reason is that a simple test shows that exclude works on non-factors.

I'm not sure what the best course of action is. What I've reported is a case where use of the options in a fairly obvious way gives an unexpected answer. On the other hand, I have never before seen or considered the case where someone wanted to exclude an actual data level from table: I myself would always have removed a column from the result. If fixing this causes other problems, then perhaps we just give up on this rare case.

As to our local choices, we figured out a way to make display of NA the default without causing the above problem. As is often the case, a fairly simple solution became obvious to us about 30 minutes after submitting a question to the list.

Terry T.
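The result Terry describes (drop the value 2, still report NA) can also be obtained without relying on how 'exclude' interacts with 'useNA', by removing the unwanted observations before tabulating. The following is only a sketch of that idea, not code from the thread; the names x, keep and tab are illustrative.

```r
x <- c(1, 2, 3, 4, NA)

# Keep every NA and every value other than 2. The is.na() test is combined
# with '|' so that the NA produced by 'NA != 2' is overridden by TRUE.
keep <- is.na(x) | x != 2

# The value 2 has already been removed, so 'exclude' is not needed and
# cannot interfere with the NA count.
tab <- table(x[keep], useNA = "ifany")
print(tab)
```

The trade-off is one extra line at the call site in exchange for not depending on the version-specific handling of 'exclude' for non-factors.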
[Rd] PROTECT help
I received the following note this AM. The problem is, I'm not quite sure how to fix it. Can one use PROTECT(coxlist(eval(PROTECT , do I create an intermediate variable, or otherwise?

I'm willing to update the code if someone will give me a pointer to the right documentation. This particular chunk was written when there was a lot of change going on in the callback mechanism, so there might be a safer and/or simpler and/or more standard approach by now.

The routine in question has to do with penalized Cox models: the C code needs to get the value of the penalty, and the penalty is an arbitrary S expression passed down from top level.

Terry T

> In survival_2.36-12 (and earlier), in the function cox_callback() at cox_Rcallback.c:40:
>
>   PROTECT(coxlist=eval(lang2(fexpr,data),rho));
>
> the return value of the call to lang2() is vulnerable if allocations within eval() give rise to garbage collection. (Discovered during CXXR development.)
>
> Andrew
Re: [Rd] PROTECT help
On 27/03/2012 14:22, Terry Therneau wrote:
> I received the following note this AM. The problem is, I'm not quite sure how to fix it. Can one use PROTECT(coxlist(eval(PROTECT , do I create an intermediate variable, or otherwise?

You can, but I find it easiest to follow if you create an intermediate variable. Look for example at unique.c:

  SEXP call, r;
  PROTECT(call = lang2(install("as.character"), s));
  PROTECT(r = eval(call, env));
  UNPROTECT(2);
  return r;

-- 
Brian D. Ripley, rip...@stats.ox.ac.uk
Re: [Rd] PROTECT help
On 12-03-27 9:22 AM, Terry Therneau wrote:
> I received the following note this AM. The problem is, I'm not quite sure how to fix it. Can one use PROTECT(coxlist(eval(PROTECT , do I create an intermediate variable, or otherwise?

I think both would work. The usual style in R sources is to use an intermediate variable, assigned within the PROTECT call, e.g.

  PROTECT(var = f());

but this would act the same as

  var = PROTECT(f());

I don't know where the best docs are, but here is my understanding of PROTECT:

What PROTECT(x) does is to make a copy of the pointer x in a stack of protected pointers. When garbage collection happens, nothing in that stack will be released. It is safe to protect things that don't need protection, but it is a little inefficient. (You shouldn't call PROTECT() on a pointer that isn't an R object declared as a SEXP, but it will only cause trouble in certain debugging modes.)

PROTECT(x) does return the value of x, so f(PROTECT(x)) should evaluate the same as f(x) (but x will be protected from collection).

The main thing to watch when you use PROTECT is that you keep track of how many times it is called, because UNPROTECT just pops a number of pointers off the protection stack.
Re: [Rd] PROTECT help
Brian & Duncan:

Thanks. This was exactly what I needed to know.

Terry

On 03/27/2012 08:41 AM, Prof Brian Ripley wrote:
> You can, but I find it easiest to follow if you create an intermediate variable. Look for example at unique.c:
>
>   SEXP call, r;
>   PROTECT(call = lang2(install("as.character"), s));
>   PROTECT(r = eval(call, env));
>   UNPROTECT(2);
>   return r;
Re: [Rd] CRAN policies
One of the things I have noticed with the R 2.15.0 RC and --as-cran is that I have to bump the version number of the working copy of my packages immediately after putting a version on CRAN, or I get a message about version suitability. This is probably a good thing for packages that I have changed, compared with my old habit of bumping the version number at arbitrary times, although the mechanics are a nuisance because I do not actually want to commit to the next version number at that point.

For packages that I have not changed it is a bit worse, because I have to change the version number even though I have not yet made any changes to the package. This will mean, for example, that on R-forge it will look like there is a slightly newer version, even though there is not really.

I am curious how other developers approach this. Is it better to not specify --as-cran most of the time? My feeling is that it is better to specify it all of the time so that I catch errors sooner rather than later, but maybe there is a better solution?

Paul

On 12-03-27 07:52 AM, Prof Brian Ripley wrote:
> - run R CMD check --as-cran on the tarball before you submit it. Do this with the latest version of R possible: definitely R 2.14.2, preferably R 2.15.0 RC or a recent R-devel.
Re: [Rd] CRAN policies
On 27.03.2012 16:17, Paul Gilbert wrote:
> I am curious how other developers approach this. Is it better to not specify --as-cran most of the time? My feeling is that it is better to specify it all of the time so that I catch errors sooner rather than later, but maybe there is a better solution?

--as-cran is modelled rather closely after the CRAN incoming checks. CRAN checks if a new version has a new version number. Of course, you can ignore its result if you do not want to submit. The idea of using --as-cran is to apply it before you actually submit. Some parts require network connection etc.

Uwe
Re: [Rd] CRAN policies
On Tue, Mar 27, 2012 at 7:52 AM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:
> CRAN has for some time had a policies page at http://cran.r-project.org/web/packages/policies.html and we would like to draw this to the attention of package maintainers.

Regarding the part about warnings or significant notes on that page, it's impossible to know which notes are significant and which ones are not except by trial and error.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [Rd] CRAN policies
On 12-03-27 10:59 AM, Uwe Ligges wrote:
> --as-cran is modelled rather closely after the CRAN incoming checks. CRAN checks if a new version has a new version number. Of course, you can ignore its result if you do not want to submit. The idea of using --as-cran is to apply it before you actually submit. Some parts require network connection etc.
>
> Uwe

Yes but, for example, will R-forge run checks with --as-cran, and thus give warnings for any package unchanged from the one on CRAN, or run without --as-cran, and thus not give a true indication of whether the package is good to submit? (No doubt R-forge will customise more, but I am trying to work out a strategy for my own automatic testing.)

Paul
Re: [Rd] CRAN policies
On 27.03.2012 17:22, Paul Gilbert wrote:
> Yes but, for example, will R-forge run checks with --as-cran, and thus give warnings for any package unchanged from the one on CRAN, or run without --as-cran, and thus not give a true indication of whether the package is good to submit?

This is a question for the R-forge maintainer. I would not expect it to run checks with --as-cran, but I do not know.

Best,
Uwe

> (No doubt R-forge will customise more, but I am trying to work out a strategy for my own automatic testing.)
>
> Paul
Re: [Rd] CRAN policies
On 27.03.2012 17:09, Gabor Grothendieck wrote:
> Regarding the part about warnings or significant notes in that page, its impossible to know which notes are significant and which ones are not significant except by trial and error.

Right, it needs human inspection to identify false positives. We believe most package maintainers are able to see whether they are hit by such a false positive.

Uwe Ligges
Re: [Rd] CRAN policies
On 27/03/2012 15:17, Paul Gilbert wrote:
> I am curious how other developers approach this. Is it better to not specify --as-cran most of the time? My feeling is that it is better to specify it all of the time so that I catch errors sooner rather than later, but maybe there is a better solution?

Yes. It is only recommended for use just before submission. It is not used by the CRAN daily checks, for example.

All it does is set some environment variables that you can also set in ~/.R/check.Renviron, scripts ... and that is what the CRAN team do. We introduced --as-cran to make it easier to explain to submitters how to get the check results we reported [*]. As for what the set is, read 'R Internals' or the code (it will vary by R version).

Given that we get several submissions per week with the same version number or name as a package already on CRAN, we do need submitters to run the 'incoming' check before submission.

[*] Since answering several emails a day about why their results were different was taking up far too much time.

-- 
Brian D. Ripley, rip...@stats.ox.ac.uk
Re: [Rd] CRAN policies
2012/3/27 Uwe Ligges lig...@statistik.tu-dortmund.de:

On 27.03.2012 17:09, Gabor Grothendieck wrote:

On Tue, Mar 27, 2012 at 7:52 AM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

CRAN has for some time had a policies page at http://cran.r-project.org/web/packages/policies.html and we would like to draw this to the attention of package maintainers. In particular, please

- always send a submission email to c...@r-project.org with the package name and version on the subject line. Emails sent to individual members of the team will result in delays at best.

- run R CMD check --as-cran on the tarball before you submit it. Do this with the latest version of R possible: definitely R 2.14.2, preferably R 2.15.0 RC or a recent R-devel. (Later versions of R are able to give better diagnostics, e.g. for compiled code and especially on Windows. They may also have extra checks for recently uncovered problems.)

Also, please note that CRAN has a very heavy workload (186 packages were published last week) and to remain viable needs package maintainers to make its life as easy as possible.

Regarding the part about warnings or significant notes in that page, it's impossible to know which notes are significant and which ones are not significant except by trial and error.

Right, it needs human inspection to identify false positives. We believe most package maintainers are able to see if he or she is hit by such a false positive.

The problem is that a note is generated and the note is correct. It's not a false positive. But that does not tell you whether it's significant or not. There is no way to know. One can either try to remove all notes (which may not be feasible) or just upload it and by trial and error find out if it's accepted or not.

--
Statistics Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [Rd] .Call ref card
On Mar 27, 2012, at 12:03 PM, Terry Therneau wrote:

On 03/23/2012 10:58 AM, Simon Urbanek wrote:

This is my shot at a cheat sheet. Comments are welcome.

Simon

I was looking through the cheat sheet. It's nice. There are a few things in it that I can't find in the documentation though. Where would one find a description? (I can guess, but that may be dangerous).

mkNamed

It is a shorthand for using allocVector and then setting names (which can be tedious). It's a simple way to create a result list/object (a very common thing to do):

    SEXP res = PROTECT(mkNamed(VECSXP, (const char*[]) { "foo", "bar", "" }));
    // fill res with SET_VECTOR_ELT(res, ..)
    setAttrib(res, R_ClassSymbol, mkString("myClass"));
    UNPROTECT(1);
    return res;

Note that the sentinel is "" (not NULL as commonly used in other APIs). Also you don't specify the length because it is determined from the names.

R_NaInt (I don't see quite how this differs from using NA_INTEGER to set a result)

It doesn't really -- NA_INTEGER is defined to be R_NaInt. In theory NA_INTEGER being a macro could be a constant instead -- maybe for efficiency -- but currently it's not.

R_PreserveObject, R_ReleaseObject (Advantages/disadvantages wrt PRESERVE?)

I guess you mean wrt PROTECT? Preserve/Release is used for objects that you want to be globally preserved - i.e. they will survive exit from the function. In contrast, the protection stack is popped when you exit the function (both by error or success).

Cheers, Simon
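For readers more at home in R than in C, the object built by the mkNamed() example above corresponds roughly to the following R code. This is an R-level analogy only and not part of the cheat sheet; foo, bar and myClass are the placeholder names from the example, and the values assigned are made up.

```r
# R-level analogue of the C snippet: a list whose length follows from its
# names, with elements initially NULL, which is then filled and given a class.
res <- setNames(vector("list", 2L), c("foo", "bar"))  # roughly mkNamed(VECSXP, ...)
res$foo <- 1:3            # roughly SET_VECTOR_ELT(res, 0, ...)
res$bar <- "done"         # roughly SET_VECTOR_ELT(res, 1, ...)
class(res) <- "myClass"   # roughly setAttrib(res, R_ClassSymbol, mkString("myClass"))
str(unclass(res))
```

There is no R-level counterpart to PROTECT/UNPROTECT here, since R code never handles unprotected objects directly.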
Re: [Rd] .Call ref card
FWIW: I have put the (slightly updated) sheet at http://r.research.att.com/man/R-API-cheat-sheet.pdf

Note that it is certainly incomplete - but that is intentional a) to fit the space constraints and b) to show only the most basic things since we are talking about starting with .Call -- advanced users may need a different sheet but then they just go straight to the headers anyway ...

Cheers, Simon
Re: [Rd] Missing Windows binary for R-2.15RC?
On Sat, Mar 24, 2012 at 1:07 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

On 12-03-24 2:31 PM, Simon Urbanek wrote:

On Mar 24, 2012, at 12:43 PM, Duncan Murdoch wrote:

On 12-03-24 10:53 AM, Uwe Ligges wrote:

On 24.03.2012 06:58, Daniel Nordlund wrote:

-Original Message-
From: Dan Tenenbaum [mailto:dtene...@fhcrc.org]
Sent: Friday, March 23, 2012 5:48 PM
To: Daniel Nordlund
Cc: r-devel@r-project.org
Subject: Re: [Rd] Missing Windows binary for R-2.15RC?

On Fri, Mar 23, 2012 at 4:52 PM, Daniel Nordlund djnordl...@frontier.com wrote:

-Original Message-
From: r-devel-boun...@r-project.org [mailto:r-devel-bounces@r-project.org] On Behalf Of Dan Tenenbaum
Sent: Friday, March 23, 2012 12:21 PM
To: r-devel@r-project.org
Subject: [Rd] Missing Windows binary for R-2.15RC?

Hi,

The page http://cran.r-project.org/bin/windows/base/rtest.html has a link to: http://cran.r-project.org/bin/windows/base/R-2.15.0rc-win.exe

However, clicking on that link gives a 404 'Object not found' error.

FYI.
Dan

I experienced the same error you did using the link you provided. However, if you use the CRAN mirror hosted by YOUR organization, you can get the file. :-)

I don't think so: http://cran.fhcrc.org/bin/windows/base/R-2.15.0rc-win.exe gives me a 404 as well.

Dan

I didn't look closely enough at what you were asking for (RC versus beta). R-2.15RC may not have been up-loaded yet. However, I just downloaded it from the original link that was posted, so it appears to be available now.

It may have happened that the scripts generated the webpages before the binary was built and checked (since beta became rc yesterday).

Yes, they need manual tweaking at the conversion, and I did it after the first upload. If this happens again (which is pretty likely), you can manually download the previous version by editing the URL to put in alpha in place of beta, or beta in place of rc.

... or have a fixed name instead (on OS X we just use 2.15-branch which is unambiguous).

For the record I find it extremely annoying that even the installation target name changes in the installer - I keep having to change it to R-2.15 all the time, because I don't see why you would want to have alpha/beta/rc/release of the same R version installed in separate directories by default - but that may be just me ;). To a lesser degree the same applies to patch versions, but since those are released I could see an argument for that, even though in practice I think it is not useful either (because typically you just want to upgrade and not another copy).

I'm neutral about the name changes, but I don't think any of this is enough of a problem to be worth the time to fix. If someone else wants to do it, then I'd be happy to let you take over.

Thanks all of you for looking into this. Bioconductor usually needs the binaries as soon as they are available so if there is a sustainable way to solve this, we'd appreciate it very much.

Dan

Duncan Murdoch
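The manual workaround mentioned above (put alpha in place of beta, or beta in place of rc, in the download URL) can be sketched in R as follows. previous_stage_url() is a hypothetical helper, and the assumption that installer names always end in "<version><stage>-win.exe" comes only from the two URLs quoted in this thread.

```r
# Hypothetical helper: step an installer URL back one pre-release stage
# (rc -> beta, beta -> alpha), mirroring the manual URL edit described above.
# Assumption: file names end in "<version><stage>-win.exe".
previous_stage_url <- function(url) {
  previous <- c(rc = "beta", beta = "alpha")
  # Match the stage only where it directly precedes "-win.exe" at the end,
  # so that host names such as cran.fhcrc.org are left alone.
  hit <- regmatches(url, regexpr("(rc|beta)(?=-win\\.exe$)", url, perl = TRUE))
  if (length(hit) == 0L) return(NA_character_)  # no earlier stage to try
  sub(paste0(hit, "-win.exe"), paste0(previous[[hit]], "-win.exe"), url, fixed = TRUE)
}

rc_url <- "http://cran.r-project.org/bin/windows/base/R-2.15.0rc-win.exe"
beta_url <- previous_stage_url(rc_url)
print(beta_url)
```

Whether the earlier file still exists on a given mirror is a separate question that the sketch does not check.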
Re: [Rd] CRAN policies
2012/3/27 Uwe Ligges lig...@statistik.tu-dortmund.de:

On 27.03.2012 19:10, Jeffrey Ryan wrote:

Is there a distinction as to NOTE vs. WARNING that is documented? I've always assumed (wrongly?) that NOTES weren't an issue with publishing on CRAN, but that they may change to WARNINGS at some point.

We won't kick packages off CRAN for Notes (but we will if Warnings are not fixed), but we may not accept new submissions with significant Notes.

Yes, I understand that but that does not really address the problem that one has no idea of whether a Note is significant or not so the only way to determine its significance is to submit your package and see if it's accepted or not.

--
Statistics Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [Rd] CRAN policies
An associated problem, for the wish list, is that it would be nice for package developers to have a way to automatically distinguish between NOTEs that can usually be ignored (e.g. a package suggests a package that is not available for cross reference checks - I have several cases where the suggested package depends on the package being built, so this NOTE occurs all the time), and NOTEs that are really pre-WARNINGS, so that one can flag these and spend time fixing them before they become a WARNING or ERROR. Perhaps two different kinds of notes?

(And, BTW, having been responsible for a certain amount of the "[*] Since answering several emails a day about why their results were different was taking up far too much time.", I think --as-cran is great.)

Paul

On 12-03-27 02:19 PM, Uwe Ligges wrote:

On 27.03.2012 19:10, Jeffrey Ryan wrote:

Is there a distinction as to NOTE vs. WARNING that is documented? I've always assumed (wrongly?) that NOTES weren't an issue with publishing on CRAN, but that they may change to WARNINGS at some point.

We won't kick packages off CRAN for Notes (but we will if Warnings are not fixed), but we may not accept new submissions with significant Notes.

Best, Uwe Ligges

Is the process by which this happens documented somewhere?

Jeff
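As a small step toward the wish above, one could at least tally the statuses in check output so that NOTEs are not overlooked. The sketch below assumes each check is reported on a line of the form "* checking <something> ... STATUS"; check_status_table() and the log excerpt are made up for illustration, and nothing here can decide which NOTEs CRAN would consider significant.

```r
# Hypothetical helper: count the statuses reported in R CMD check output.
# Assumption: one "* checking ... ... STATUS" line per check.
check_status_table <- function(log_lines) {
  pattern <- "^\\* checking .* \\.\\.\\. (OK|NOTE|WARNING|ERROR)$"
  status_lines <- grep(pattern, log_lines, value = TRUE)
  statuses <- sub("^.* \\.\\.\\. ", "", status_lines)  # keep only the trailing status
  table(factor(statuses, levels = c("OK", "NOTE", "WARNING", "ERROR")))
}

# Made-up log excerpt, for illustration only.
example_log <- c(
  "* checking package dependencies ... NOTE",
  "* checking R code for possible problems ... OK",
  "* checking Rd cross-references ... WARNING"
)
counts <- check_status_table(example_log)
print(counts)
```

Real check logs also carry detail lines under each NOTE, and those would still need to be read by a person.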
Re: [Rd] serialization regression in 2.15.0 beta
In case anyone is concerned that this regression will affect them, the code was reverted to the 2.14.x behavior by

r58842 | ripley | 2012-03-26 08:12:43 -0400 (Mon, 26 Mar 2012) | 1 line
Changed paths:
   M /branches/R-2-15-branch/doc/NEWS.Rd
   M /branches/R-2-15-branch/src/library/parallel/R/unix/forkCluster.R
   M /branches/R-2-15-branch/src/library/parallel/R/unix/mcfork.R

revert to XDR serialization for 2.15.0

Thanks,
Ben

I am experiencing a problem related to serialization behavior in 2.15.0 beta (binary installed from Debian unstable) and 2.16.0 (from svn) that is not present in 2.14.2 (binary from Debian testing). I don't fully understand the problem. Also, I tried but have not yet been able to create a small, self-contained example that reproduces the problem. However, I do have a large, not self-contained example, which requires an alpha version (not yet on CRAN) of the mi package (the mi package on CRAN would not exhibit this issue). Anyone interested in reproducing the problem can follow the readme.txt file in this directory:

http://www.columbia.edu/~bg2382/mi/serialization/

I track r-devel with git-svn and was able to git bisect to svn commit r58219

commit 799102bd9d0266fe89c3120981decf0b1f17ef11
Author: ripley <ripley at 00db46b3-68df-0310-9c12-caf00c1e9a41>
Date: Sat Jan 28 15:02:34 2012 +

make use of non-xdr serialization;.

although this commit could merely expose the problem rather than cause it.

The problem occurs when the FUN called by mclapply() in the parallel package returns an S4 object that contains a slot (called X) that is a large matrix, specifically a model matrix similar to that produced by glm(). Some columns of this matrix get corrupted with wrong values (usually zero, but sometimes NaN or 10^300ish), which can be seen by examining X right before FUN returns (to mclapply()'s environment) and comparing to the same X after mclapply() returns to the calling environment.

Part of svn commit r58219 is this hunk

diff --git a/src/library/parallel/R/unix/mcfork.R b/src/library/parallel/R/unix/mcfork.R
index 8e27534..4f92193 100644
--- a/src/library/parallel/R/unix/mcfork.R
+++ b/src/library/parallel/R/unix/mcfork.R
@@ -82,7 +82,8 @@ mckill <- function(process, signal = 2L)
 ## used by mcparallel, mclapply
 sendMaster <- function(what)
 {
-    if (!is.raw(what)) what <- serialize(what, NULL, FALSE)
+    # This is talking to the same machine, so no point in using xdr.
+    if (!is.raw(what)) what <- serialize(what, NULL, xdr = FALSE)
     .Call(C_mc_send_master, what, PACKAGE = "parallel")
 }

Contrary to the comment, I have found that if I specify xdr = TRUE, I get the expected (non-corrupted X slot) behavior in 2.16.0, even though it is forking locally on my 64bit Debian laptop with a little endian i7 processor, whose specs are

goodrich at CYBERPOWERPC:/tmp/serialization$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz
stepping : 7
microcode : 0x17
cpu MHz : 800.000
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer xsave avx lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
bogomips : 3990.83
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
...
processor : 7
[same as processor 0]

So, to summarize I get the good behavior on R 2.14.2 when using mclapply(), on 2.15.0 beta when using lapply(), and on 2.16.0 using mclapply() iff I patch in xdr = TRUE in sendMaster(). I get the bad behavior on 2.15.0 beta and unpatched 2.16.0 when using mclapply().

My session info:

sessionInfo()
R version 2.15.0 beta (2012-03-16 r58769)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached
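A simple way to look for this kind of corruption outside mclapply() is to round-trip an object through serialize() with and without XDR and compare the result with the original. The sketch below is not the reproduction from the readme above: the class name Holder, the matrix size and the random data are made up, and because it does not go through sendMaster() or a forked process, a clean result here does not rule out the reported problem.

```r
library(methods)

# Round-trip a large numeric matrix held in an S4 slot through serialize(),
# with and without XDR, and compare against the original.
# "Holder" is a made-up class mirroring the description above (slot X).
setClass("Holder", representation(X = "matrix"))

set.seed(1)
obj <- new("Holder", X = matrix(rnorm(2e5), ncol = 20))

roundtrip_ok <- function(object, xdr) {
  raw_bytes <- serialize(object, NULL, xdr = xdr)
  identical(unserialize(raw_bytes)@X, object@X)
}

ok_native <- roundtrip_ok(obj, xdr = FALSE)  # native byte order, as in r58219
ok_xdr    <- roundtrip_ok(obj, xdr = TRUE)   # XDR, as in 2.14.x and the revert
print(c(native = ok_native, xdr = ok_xdr))
```

Running the same comparison on the value returned by mclapply() against the value computed by lapply() would be the closer analogue of the test described in the message.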
Re: [Rd] CRAN policies
I have been wondering if it is possible to automate the checking process to reduce human efforts, e.g. automatically check the packages submitted to FTP, and send the package maintainer an email in case of warnings or errors (otherwise just move it to CRAN); package maintainers can appeal for a manual check by CRAN maintainers in case of false positives.

I've started using win-builder before submitting to CRAN. This often picks up problems that I don't see locally.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
Re: [Rd] CRAN policies
On Tue, Mar 27, 2012 at 6:52 AM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

CRAN has for some time had a policies page at http://cran.r-project.org/web/packages/policies.html and we would like to draw this to the attention of package maintainers. In particular, please

Thanks for the pointer - I did not know that this page existed. In general, is there some easy way to track changes to this page and the R extension manual over time? It is difficult to keep track of the best practices.

I'd also like to get clarification on

"Packages should not write in the users' home filespace, nor anywhere else on the file system apart from the R session's temporary directory (or during installation in the location pointed to by TMPDIR: and such usage should be cleaned up)."

- what is recommended practice for packages to maintain state across instances? Operating systems have standards for where applications can store settings (e.g. as described in http://pypi.python.org/pypi/appdirs/1.2.0). Is it acceptable for packages to follow these conventions?

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
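For reference, the part of the quoted policy that is unambiguous (write only under the session's temporary directory, and clean up) looks like this in practice. The file name and the contents of the state object are made up, and the sketch deliberately does not answer the question of keeping state across sessions, since tempdir() does not survive the session.

```r
# Keep working state inside the R session's temporary directory and remove it
# afterwards, which is what the quoted policy permits.
state <- list(last_run = "2012-03-27", n_calls = 3L)      # made-up state
state_file <- file.path(tempdir(), "mypkg-state.rds")     # hypothetical file name

saveRDS(state, state_file)
restored <- readRDS(state_file)

unlink(state_file)  # clean up, as the policy asks
```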
Re: [Rd] CRAN policies
Lots of very sensible policies here. I have one request as someone who has in several cases had to involve company lawyers over intellectual property issues with packages on CRAN -- the first bullet point on ownership of copyright and intellectual property rights could be strengthened further. To the existing text

"The ownership of copyright and intellectual property rights of all components of the package must be clear and unambiguous (including from the authors specification in the DESCRIPTION file). Where code is copied (or derived) from the work of others (including from R itself), care must be taken that any copyright statements are preserved and authorship is not misrepresented. Trademarks must be respected."

I would add a few additional points:

1. The text of the license itself should be included in the package in a LICENSE or COPYING file, as most of these licenses have things that need to be filled in with names and other data, and just referencing a license name in the DESCRIPTION file is not really a great way to deal with licensing metadata when used exclusively (it's a great complement to a full, filled-out license in the package itself).

2. Per file copyright comment headers can help immensely with ensuring compliance and preventing the accidental incorporation of files under a different license. Comment header blocks with the author name and terms of distribution could be recommended for all source files.

- Murray
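The second suggestion (per-file copyright headers) lends itself to a simple automated check. The sketch below is one possible approach and not an existing CRAN check: files_missing_header() is a hypothetical helper, the pattern (a comment line containing "Copyright" within the first 10 lines) is an assumption, and the author name in the sample file is a placeholder.

```r
# Hypothetical helper: report source files whose first few lines contain no
# "Copyright" comment. The pattern and the 10-line window are assumptions.
files_missing_header <- function(paths, n = 10L) {
  has_header <- vapply(paths, function(p) {
    first_lines <- readLines(p, n = n, warn = FALSE)
    any(grepl("^[[:space:]]*#.*Copyright", first_lines))
  }, logical(1))
  unname(paths[!has_header])
}

# Two throwaway files for illustration.
src_dir <- tempfile("Rsrc")
dir.create(src_dir)
with_header <- file.path(src_dir, "a.R")
without_header <- file.path(src_dir, "b.R")
writeLines(c("# Copyright (C) 2012 A. N. Author",
             "# License: see the LICENSE file",
             "f <- function() 1"), with_header)
writeLines("g <- function() 2", without_header)

lacking <- basename(files_missing_header(c(with_header, without_header)))
print(lacking)

unlink(src_dir, recursive = TRUE)
```

A real check would also need to cover C, Fortran and documentation sources, which use different comment syntax.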
Re: [Rd] serialization regression in 2.15.0 beta
On 27/03/2012 22:01, Ben Goodrich wrote:

In case anyone is concerned that this regression will affect them, the code was reverted to the 2.14.x behavior by

r58842 | ripley | 2012-03-26 08:12:43 -0400 (Mon, 26 Mar 2012) | 1 line
Changed paths:
   M /branches/R-2-15-branch/doc/NEWS.Rd
   M /branches/R-2-15-branch/src/library/parallel/R/unix/forkCluster.R
   M /branches/R-2-15-branch/src/library/parallel/R/unix/mcfork.R

revert to XDR serialization for 2.15.0

But the underlying problem (in non-xdr binary unserialization) is AFAWK fixed: it was just that at this late stage there was too little time to test thoroughly before release.

Please test R-devel on your own problem (we haven't: the issue was found using a different example from elsewhere).

Thanks,
Ben

I am experiencing a problem related to serialization behavior in 2.15.0 beta (binary installed from Debian unstable) and 2.16.0 (from svn) that is not present in 2.14.2 (binary from Debian testing). I don't fully understand the problem. Also, I tried but have not yet been able to create a small, self-contained example that reproduces the problem. However, I do have a large, not self-contained example, which requires an alpha version (not yet on CRAN) of the mi package (the mi package on CRAN would not exhibit this issue). Anyone interested in reproducing the problem can follow the readme.txt file in this directory:

http://www.columbia.edu/~bg2382/mi/serialization/

I track r-devel with git-svn and was able to git bisect to svn commit r58219

commit 799102bd9d0266fe89c3120981decf0b1f17ef11
Author: ripley <ripley at 00db46b3-68df-0310-9c12-caf00c1e9a41>
Date: Sat Jan 28 15:02:34 2012 +

make use of non-xdr serialization;.

although this commit could merely expose the problem rather than cause it.

The problem occurs when the FUN called by mclapply() in the parallel package returns an S4 object that contains a slot (called X) that is a large matrix, specifically a model matrix similar to that produced by glm().
Some columns of this matrix get corrupted with wrong values (usually zero, but sometimes NaN or 10^300ish), which can be seen by examining X right before FUN returns (to mclapply()'s environment) and comparing to the same X after mclapply() returns to the calling environment.

Part of svn commit r58219 is this hunk

diff --git a/src/library/parallel/R/unix/mcfork.R b/src/library/parallel/R/unix/mcfork.R
index 8e27534..4f92193 100644
--- a/src/library/parallel/R/unix/mcfork.R
+++ b/src/library/parallel/R/unix/mcfork.R
@@ -82,7 +82,8 @@ mckill <- function(process, signal = 2L)
 ## used by mcparallel, mclapply
 sendMaster <- function(what)
 {
-    if (!is.raw(what)) what <- serialize(what, NULL, FALSE)
+    # This is talking to the same machine, so no point in using xdr.
+    if (!is.raw(what)) what <- serialize(what, NULL, xdr = FALSE)
     .Call(C_mc_send_master, what, PACKAGE = "parallel")
 }

Contrary to the comment, I have found that if I specify xdr = TRUE, I get the expected (non-corrupted X slot) behavior in 2.16.0, even though it is forking locally on my 64bit Debian laptop with a little endian i7 processor, whose specs are

goodrich at CYBERPOWERPC:/tmp/serialization$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz
stepping : 7
microcode : 0x17
cpu MHz : 800.000
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer xsave avx lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
bogomips : 3990.83
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
...
processor : 7
[same as processor 0]

So, to summarize I get the good behavior on R 2.14.2 when using mclapply(), on 2.15.0 beta when using lapply(), and on 2.16.0 using mclapply() iff I patch in xdr = TRUE in sendMaster(). I get the bad behavior on 2.15.0 beta and unpatched 2.16.0 when using mclapply().

My session info:

sessionInfo()
R version 2.15.0 beta (2012-03-16 r58769)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5]