[Rd] Conflicts between 'parallel' and 'Rprof', and between two parallel R sessions
Dear list,

I observed two problems that I suppose are generic.

First, using 'Rprof' to profile parallel code (based on the package 'parallel') caused

Error in unserialize(node$con) : error reading from connection

Second, on a multicore desktop, I opened two terminals and ran two separate R sessions concurrently, both running (actually identical) parallel code, which sets up a cluster with as many nodes as there are cores. I got

...
Error in socketConnection("localhost", port = port, server = TRUE, blocking = TRUE, :
  cannot open the connection
In addition: Warning message:
In socketConnection("localhost", port = port, server = TRUE, blocking = TRUE, :
  port 10187 cannot be opened

Are there ways to do the two things above without problems?

Thanks!
Zepu

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
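A possible workaround for the second problem, sketched under assumptions: the warning shows both sessions trying to listen on the same port (10187), so giving each session its own port may avoid the clash. The sketch assumes the installed version of 'parallel' accepts the `port` option of makePSOCKcluster() described in ?makeCluster. The helpers `pick_port` and `start_cluster` are hypothetical names introduced here, and the port range is an arbitrary choice. It does not address the 'Rprof' problem.

```r
library(parallel)

# Hypothetical helper: derive a port from the process ID so that two R
# sessions started on the same machine are unlikely to choose the same one.
# The base and span values are arbitrary assumptions.
pick_port <- function(base = 11000L, span = 1000L) {
  base + (Sys.getpid() %% span)
}

# Hypothetical wrapper: start a socket cluster of n workers on that port.
start_cluster <- function(n, port = pick_port()) {
  makePSOCKcluster(n, port = port)
}

# Usage (not run here):
# cl <- start_cluster(2)
# parSapply(cl, 1:4, function(i) i^2)
# stopCluster(cl)
```

Two processes can still collide if their process IDs differ by a multiple of `span`, so this only lowers the chance of a conflict. Setting the environment variable R_PARALLEL_PORT to a different value in each terminal is another documented way to choose the port.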
Re: [Rd] avoid copying big object passed into optimize()
Thanks! I found, say, exp(x) causes 2 duplications whereas sum(x) 0 duplication. Is there any document to learn from about this?

(first time list user. sorry if anything about the posting procedure is wrong)

On Wed, Mar 9, 2011 at 6:16 PM, Matt Shotwell wrote:

> On Wed, 2011-03-09 at 17:15 -0900, Zepu Zhang wrote:
> > Hello list,
> >
> > I have the following scenario:
> >
> > f1 <- function(a)
> > {
> >     # doing things; may need 'a', but does not change 'a'.
> >
> >     g <- function(x)
> >     {
> >         sum(x + a)    # Say. Use 'a'; does not change 'a'.
>
> The expression 'x + a' causes 'a' to be duplicated; 'x' is added to each
> element of the duplicated vector, then returned. The sum occurs
> afterward. To avoid this use an expression like: 'length(a) * x +
> sum(a)'. Also, please see this recent thread regarding the
> pass-by-value / pass-by-reference issue:
> http://tolstoy.newcastle.edu.au/R/e13/help/11/03/6632.html
>
> >     }
> >
> >     optimize(f = g, lower = 0, upper = 1)
> > }
> >
> >
> > f2 <- function()
> > {
> >     b <- runif(1000)    # Create big object.
> >
> >     f1(a = b)
> > }
> >
> >
> > My main concern is to reduce copying of the big object 'a'. Questions:
> >
> > (1) In f1, 'a' never appears on the LHS of assignment. Is it passed by value
> > or by reference? Say the situation is simpler and more general: no
> > optimization call in f1.
>
> 'a' is passed by value, but not necessarily copied in memory.
>
> > (2) Is there any difference, as far as copying of the big 'a' is concerned,
> > if 'g' is changed to
> >     g <- function(x, b) { sum(x + b) }
> > and called by
> >     optimize(f = g, lower = 0, upper = 1, b = a)
>
> No.
>
> > (3) Is 'a' passed into the C optimization function one-off, or again and
> > again across the C-R interface?
>
> I don't think either is completely correct. But more to your point, 'a'
> is not necessarily copied repeatedly. If you make the substitution I
> suggested above for 'g', then 'a' is not repeatedly copied.
>
> > (4) Does it help if I remove the argument 'a' of 'f1', and let 'g' look for
> > it (of course it should be referred to as 'b' now) directly in the
> > environment of 'f2'?
>
> No. 'g' would then search and find 'a' farther down the environment
> tree.
>
> > (5) Any suggestions?
>
> Avoid operations that necessitate a copy. Compile R with
> --enable-memory-profiling and use the tracemem function to help in this.
>
> > Many thanks for your help!
> >
> > Zepu
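The two suggestions in the reply above (replace 'sum(x + a)' with 'length(a) * x + sum(a)', and use tracemem to watch for copies) can be sketched as follows. This is a minimal illustration, not code from the thread: the names `g_naive`, `g_fast`, `read_only` and `modifies` are made up here, and the tracemem part only runs if the R build reports memory-profiling support.

```r
# Sketch only: all function names below are illustrative assumptions.
set.seed(1)
a <- runif(1000)          # stands in for the "big object"

# Original objective: 'x + a' allocates a temporary vector of length(a)
# on every call before it is summed.
g_naive <- function(x) sum(x + a)

# Suggested rewrite: for scalar x, sum(x + a) equals length(a) * x + sum(a).
# Both constants are computed once, outside the objective.
n <- length(a)
s <- sum(a)
g_fast <- function(x) n * x + s

# The two objectives agree up to floating-point tolerance.
same_value <- isTRUE(all.equal(g_naive(0.3), g_fast(0.3)))

opt <- optimize(f = g_fast, lower = 0, upper = 1)

# Pass-by-value without an eager copy: only the function that modifies
# its argument makes tracemem() report a duplication.
read_only <- function(v) sum(v)
modifies  <- function(v) { v[1] <- 0; v }
if (capabilities("profmem")) {
  tracemem(a)
  read_only(a)            # no copy reported
  b <- modifies(a)        # copy reported when v[1] is assigned
  untracemem(a)
}
```

Precomputing `n` and `s` costs one pass over 'a', after which each call made by optimize() is a scalar computation. The rewrite relies on 'x' being a single number, which is how optimize() calls its objective.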
[Rd] avoid copying big object passed into optimize()
Hello list,

I have the following scenario:

f1 <- function(a)
{
    # doing things; may need 'a', but does not change 'a'.

    g <- function(x)
    {
        sum(x + a)    # Say. Use 'a'; does not change 'a'.
    }

    optimize(f = g, lower = 0, upper = 1)
}


f2 <- function()
{
    b <- runif(1000)    # Create big object.

    f1(a = b)
}


My main concern is to reduce copying of the big object 'a'. Questions:

(1) In f1, 'a' never appears on the LHS of assignment. Is it passed by value or by reference? Say the situation is simpler and more general: no optimization call in f1.

(2) Is there any difference, as far as copying of the big 'a' is concerned, if 'g' is changed to
    g <- function(x, b) { sum(x + b) }
and called by
    optimize(f = g, lower = 0, upper = 1, b = a)

(3) Is 'a' passed into the C optimization function one-off, or again and again across the C-R interface?

(4) Does it help if I remove the argument 'a' of 'f1', and let 'g' look for it (of course it should be referred to as 'b' now) directly in the environment of 'f2'?

(5) Any suggestions?

Many thanks for your help!

Zepu
[Rd] A Call for a Smaller R Core Package
(Below is my idea on an issue that has troubled me for a fairly long time. I hope it's not viewed as trouble making.)

A Call for a Smaller R Core Package

This document suggests downsizing the 'core' package of R by taking out some specialized functionalities to form their own packages. I'll use string related functions as examples, because I happened to be troubled by them today.

1. The core is too big

R is a function rich environment. However, non-central functions are better organized in specialized packages.

From time to time I have felt the need to go through the core package for a complete picture of what is there at my disposal, yet so far I haven't done that. In the 'R Reference Manual' the core package runs for over 400 pages with about 400 entries, and mysteriously some functions don't show up in the TOC, e.g. 'sub'. In the two-volume reference set printed by Network-Theory, the core is the entire first book.

In contrast, the 'Intrinsic Functions' chapter of the classic Fortran reference "Fortran 95/2003 Explained" runs for maybe 30(?) pages. I flipped through it many times and I can say with confidence, "OK these are ALL the Fortran intrinsics and I know what they do." For R, I found it an intimidating task to flip through the 400+ page core and retain a clear mind at the end.

Below is a random sample of string related functions in the core package:

    agrep basename charmatch chartr gregexpr grep gsub
    regex regexpr strsplit strtrim strwrap sub

In my opinion, anything that uses regular expressions belongs somewhere else. Even 'utils' seems to be a better place for random items than the 'core'.

2. Benefits of a smaller core

a) A smaller core will be more carefully studied and better appreciated.

If the R core functions were documented in 100 pages, I would be a much better R programmer than I am today because I would have singled out and studied the more fundamental routines about function calls, etc.

The criteria for a function to be in the core seem to be: 1) fundamental; or 2) very often used. A smaller core is more stable.

b) A specialized 'string' package makes string related functions much easier to find.

It could be that I still need all the functions. But since they are grouped together, it greatly helps learning. I would very rarely reinvent the wheel, because I could quickly get a sweeping view of the dedicated package.

c) It will be easier to enrich string-related functionalities without perplexing the core.

3. Costs of such re-arrangements

a) To the R development team

(I don't really know.) Utility functions that are frequently used in basic functions may well stay in the core. Those that are not may not be too difficult to move around. The spin-off package may always be loaded automatically as a basic one, but as discussed above, a clean grouping greatly helps learning and finding things.

b) To R users

The system (both the core and the specialized package) will be easier to learn and use.

--
Zepu Zhang, [EMAIL PROTECTED]