Re: [Rd] Background session with R
My Rdsm package will do what you want,

https://cran.r-project.org/web/packages/Rdsm/index.html

Norm Matloff

> Date: Mon, 10 Jul 2017 17:12:57 +
> From: "Stravs, Michael" <michael.str...@eawag.ch>
> To: "r-devel@r-project.org" <r-devel@r-project.org>
> Cc: "shiny-disc...@googlegroups.com" <shiny-disc...@googlegroups.com>
> Subject: [Rd] Background session with R
>
> Hi,
>
> I am working on some code to have a background R process running that I can
> submit data to, check computation progress, and retrieve results later. I am
> aware that "parallel" does a lot of that - however, "parallel" shuts down the
> nodes when I quit the master process. On the contrary, I would want these
> nodes to continue running, so I can fire up R again later and reconnect to
> the nodes to retrieve the results.
>
> The use case is Shiny apps, where I want a thin frontend as a GUI, workflow
> launcher and result viewer, and launch background computation that isn't
> dependent on the Shiny script staying alive.
>
> Has this been done already, and/or are there simple modifications of
> parallel/snow/etc that allow this? My current WIP thing uses Rserve.
>
> (shiny-discuss cc'd).
>
> Michael Stravs
> Eawag
> Umweltchemie
> BU E 23
> Überlandstrasse 133
> 8600 Dübendorf
> +41 58 765 6742

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] reference class internals
I have a question about reference classes, which someone here undoubtedly can answer immediately, saving me hours of wading through indecipherable internal code. :-) Thanks in advance.

Reference class data is mutable, fine, but in what sense? Is it really physical, or is it just a view given to the programmer? If for instance I have a vector as a field in a reference class, and I change one element of the vector, is it really true that the change is guaranteed to be made in-place, no copying, no memory reallocation etc.?

Norm
Re: [Rd] reference class internals
Bottom line: Really no different from the case of ordinary vectors that are not in reference classes, right? In other words, not true pass-by-reference.

Norm

On Thu, Jan 09, 2014 at 04:43:44PM -0600, Hadley Wickham wrote:
> It's a bit of a simplification, reference classes are wrappers around
> environments. So if modifying a value in an environment would create a
> copy, then modifying the same value in a reference class will also
> create a copy.
>
> The situation with modifying a vector is a bit complicated as it will
> sometimes be modified in place and sometimes be duplicated and modified
> (depending on whether its NAMED attribute is 1 or 2, and exactly how
> you're modifying it).
>
> Hadley
>
> -- http://had.co.nz/
Re: [Rd] reference class internals
Thanks, Hadley and Simon.

The reason I asked today was that when reference classes first came out, it had appeared to me that there is no performance advantage to using reference classes, that it was mainly a style issue (encapsulation, etc.).

Unless I'm missing something, both of you have confirmed my original impression, correct?

Norm

On Thu, Jan 09, 2014 at 09:44:10PM -0500, Simon Urbanek wrote:
> On Jan 9, 2014, at 6:20 PM, Norm Matloff matl...@cs.ucdavis.edu wrote:
>> Bottom line: Really no different from the case of ordinary vectors
>> that are not in reference classes, right? In other words, not true
>> pass-by-reference.
>
> The pass-by-reference applies to the object itself, not necessarily to
> anything you obtain by calling a function on the object (like
> extracting a part from it). Vectors are not reference-semantics objects
> so regular rules apply.
>
> If you pass a reference semantics object to a function, the function
> can modify the object. If you pass any other object, the contents are
> guaranteed to not be touched. Reference-semantics objects in R are
> literally passed by reference (same C pointer), so yes, it is true
> pass-by-reference.
>
> Cheers,
> Simon
>
> (*) - technically, there is a thin non-reference wrapper around the
> instances of reference classes, because there are things you don't want
> to happen to your ref-semantics instance - e.g. you don't want
> unclass(x) to destroy x and all instances of it (which it would do if
> there was no wrapper). But the actual payload of the object is a true
> ref-semantics object - an environment - that is always passed by
> reference.
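The distinction Simon draws can be seen in a few lines. This is only a sketch: the class name `Holder` and the two helper functions are made up for illustration. It shows which changes are visible to the caller; it says nothing about whether the vector inside the field was copied internally, which is the separate point Hadley raises.

```r
library(methods)  # provides setRefClass()

# "Holder" is a hypothetical class name used only for this sketch.
Holder <- setRefClass("Holder", fields = list(v = "numeric"))

# Both helpers try to overwrite the first element of what they are given.
set_first_ref <- function(obj) { obj$v[1] <- 99; invisible(NULL) }
set_first_vec <- function(vec) { vec[1] <- 99; invisible(NULL) }

h <- Holder$new(v = c(1, 2, 3))
x <- c(1, 2, 3)

set_first_ref(h)  # the caller's reference object changes
set_first_vec(x)  # the caller's ordinary vector does not

alias <- h        # a second name for the same object
dup <- h$copy()   # an independent object
alias$v[2] <- -1  # visible through h as well
dup$v[3] <- 0     # not visible through h

print(h$v)
print(x)
print(dup$v)
```

Assigning a reference object to a new name does not copy it, so `$copy()` is the way to get an independent instance.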
Re: [Rd] reference class internals
I guess I should explain where I'm coming from in all this.

I've always been something of a skeptic on object-oriented programming. Though I agree it has some advantages, and I do use it myself (in Python), in general I think it makes one work far too hard for the potential benefit. C++ templates (which I use in Thrust) drive me crazy, very frustrating.

So I am, for better or worse, one of those people who don't even like S4 (again a style issue). Obviously those who do like S4 may get a performance benefit via reference classes in the situation Martin mentions below.

I've been meaning for some time to look into whether there might actually be a performance benefit for non-OOP programmers like me, thinking the answer would be no but wanting to confirm.

So, today I finally got around to asking, and immediately got three quick, cogent and informative replies. This testifies to the quality of the membership of this list! Thanks very much.

Norm

On Thu, Jan 09, 2014 at 08:27:09PM -0800, Martin Morgan wrote:
> On 01/09/2014 07:53 PM, Norm Matloff wrote:
>> Thanks, Hadley and Simon.
>>
>> The reason I asked today was that when reference classes first came
>> out, it had appeared to me that there is no performance advantage to
>> using reference classes, that it was mainly a style issue
>> (encapsulation, etc.).
>>
>> Unless I'm missing something, both of you have confirmed my original
>> impression, correct?
>
> We've used reference classes for performance benefit. E.g., updating a
> single (e.g., small) field in an S4 object triggers an entire copy of
> the object, whereas for a reference class the fields can be updated
> independently. This is especially true inside function (e.g., method)
> calls (e.g., slot access), where the object is marked to be duplicated.
>
>     a = setClass("A", representation(x="numeric"))(x=1:5)
>     .Internal(inspect(a))
>     @5237508 25 S4SXP g0c0 [OBJ,NAM(2),S4,gp=0x10,ATT]
>     ATTRIB:
>       @5237460 02 LISTSXP g0c0 []
>         TAG: @12ea3a0 01 SYMSXP g0c0 [NAM(2)] x
>         @5225db8 13 INTSXP g0c3 [NAM(2)] (len=5, tl=0) 1,2,3,4,5
>         TAG: @1284b08 01 SYMSXP g0c0 [LCK,gp=0x4000] class (has value)
>         @52355c8 16 STRSXP g0c1 [NAM(2),ATT] (len=1, tl=0)
>           @4740e48 09 CHARSXP g0c1 [gp=0x61] [ASCII] [cached] A
>         ATTRIB:
>           @52373f0 02 LISTSXP g0c0 []
>             TAG: @128e500 01 SYMSXP g0c0 [NAM(2)] package
>             @5235598 16 STRSXP g0c1 [NAM(2)] (len=1, tl=0)
>               @12ee2b8 09 CHARSXP g0c2 [gp=0x61] [ASCII] [cached] .GlobalEnv
>     a@x[1]=2L
>     .Internal(inspect(a))  ## almost everything duplicated!
>     @5243cd0 25 S4SXP g0c0 [OBJ,NAM(2),S4,gp=0x10,ATT]
>     ATTRIB:
>       @5243c60 02 LISTSXP g0c0 []
>         TAG: @12ea3a0 01 SYMSXP g0c0 [NAM(2)] x
>         @5225b30 13 INTSXP g0c3 [NAM(1)] (len=5, tl=0) 2,2,3,4,5
>         TAG: @1284b08 01 SYMSXP g0c0 [LCK,gp=0x4000] class (has value)
>         @52405f8 16 STRSXP g0c1 [NAM(2),ATT] (len=1, tl=0)
>           @4740e48 09 CHARSXP g0c1 [gp=0x61] [ASCII] [cached] A
>         ATTRIB:
>           @5243bf0 02 LISTSXP g0c0 []
>             TAG: @128e500 01 SYMSXP g0c0 [NAM(2)] package
>             @52405c8 16 STRSXP g0c1 [NAM(2)] (len=1, tl=0)
>               @12ee2b8 09 CHARSXP g0c2 [gp=0x61] [ASCII] [cached] .GlobalEnv
>
> This also influences the performance of other R objects, of course,
> e.g.,
>
>     f = function(x) { x$a[1] = 2L; x }
>     l = list(a=1:5); .Internal(inspect(l))
>     @53f8448 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>       @53cef48 13 INTSXP g0c3 [] (len=5, tl=0) 1,2,3,4,5
>     ATTRIB:
>       @53f9190 02 LISTSXP g0c0 []
>         TAG: @1284638 01 SYMSXP g0c0 [LCK,gp=0x4000] names (has value)
>         @53f8418 16 STRSXP g0c1 [] (len=1, tl=0)
>           @146b128 09 CHARSXP g0c1 [gp=0x61] [ASCII] [cached] a
>     .Internal(inspect(f(l)))
>     @53f83e8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>       @53cef00 13 INTSXP g0c3 [] (len=5, tl=0) 2,2,3,4,5
>     ATTRIB:
>       @53f9988 02 LISTSXP g0c0 []
>         TAG: @1284638 01 SYMSXP g0c0 [LCK,gp=0x4000] names (has value)
>         @53f83b8 16 STRSXP g0c1 [NAM(2)] (len=1, tl=0)
>           @146b128 09 CHARSXP g0c1 [gp=0x61] [ASCII] [cached] a
>
> Copies are localized to the updated field with reference classes (can't
> show this with .Internal(inspect()), though, because
>
>     x = new.env(); x$x = x; .Internal(inspect(x))
>
> [mimicking .self in reference classes] has an infinite (? I didn't wait
> that long) recursion).
>
> I think actually reference classes have a surprising performance _hit_
> compared to other R approaches to minimizing copying; this has come up
> on this or the R mailing list before, but I've lost track of the
> original. Here's a StackOverflow version
>
> http://stackoverflow.com/questions/18677696/stack-class-in-r-something-more-concise/18678440#18678440
>
> Martin
Re: [Rd] Regression stars
Thanks for bringing this up, Frank.

Since many of us are educators, I'd like to suggest a bolder approach. Discontinue even offering the stars as an option. Sadly, we can't stop reporting p-values, as the world expects them, but does R need to cater to that attitude by offering star display? For that matter, why not have R report confidence intervals as a default?

Many years ago, I wrote a short textbook on stat, and included a substantial section on the dangers of significance testing. All three internal reviewers liked it, but the funny part is that all three said, "I agree with this, but no one else will." :-)

Norm
Re: [Rd] Regression stars
I appreciate Tim's comments.

I myself have a social science paper coming out soon in which I felt forced to use p-values, given their ubiquity. However, I also told readers of the paper that confidence intervals are much more informative and I do provide them.

As I said earlier, there is no avoiding that, and R needs to report p-values for that reason. Instead, the question is what to do about the stars; I proposed eliminating them altogether. Star-crazed users know how to determine them themselves from the p-values, but deleting them from R would send a message.

I did say my proposal was bold, which really meant I was suggesting that R do SOMETHING to send that message, not necessarily star elimination. One such something would be the proposal I made, which would be to add confidence intervals to the output. This too could be just an option, but again offering that option would send a message. Indeed, I would suggest that the help page explain that confidence intervals are more informative. (The help page could make a similar statement regarding the stars.)

When I pitch R to people, I say that in addition to the large function and library base and the nice graphics capabilities, R is above all Statistically Correct--it's written by statisticians who know what they are doing, rather than some programmer simply implementing a formula from a textbook. I know that a lot of people feel this is one of R's biggest strengths. Given that, one might argue that R should do what it can to help users engage in good statistical practice. I think this was Frank's point.

Norm
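For readers who want this today, without any change to R, a minimal sketch follows. It uses the built-in `mtcars` data and an arbitrary model chosen only for illustration; `show.signif.stars` is an existing option and `confint()` is the standard way to get intervals for an `lm` fit.

```r
# Arbitrary illustrative model on a built-in data set.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Print the coefficient table without significance stars,
# then restore the previous option value.
old <- options(show.signif.stars = FALSE)
print(summary(fit))
options(old)

# 95% confidence intervals, shown next to the point estimates.
ci <- confint(fit, level = 0.95)
print(cbind(estimate = coef(fit), ci))
```

Putting `options(show.signif.stars = FALSE)` in a startup file would make the star-free display the default for one's own sessions.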
Re: [Rd] SUGGESTION: Add get/setCores() to 'parallel' (and command line option --max-cores)
Henrik Bengtsson h...@biostat.ucsf.edu wrote:

^ In the 'parallel' package there is detectCores(), which tries its best
^ to infer the number of cores on the current machine. This is useful
^ if you wish to utilize the *maximum* number of cores on the machine.
^ Several are using this to set the number of cores when parallelizing,
^ sometimes also hardcoded within 3rd-party scripts/package code, but
^ there are several settings where you wish to use fewer, e.g. in a
^ compute cluster where your R session is given only a portion of the
^ cores available. Because of this, I'd like to propose to add
^ getCores(), which by default returns what detectCores() gives, but can

Even if one has the entire machine to oneself, there is often another very good reason not to use the maximum number of cores: Using the maximum number of cores may reduce performance. This is true in general, and sometimes especially true when the inferred number of cores includes hyperthreading.

Norm
Re: [Rd] SUGGESTION: Add get/setCores() to 'parallel' (and command line option --max-cores)
On Sat, Dec 15, 2012 at 10:58:34PM -0500, Simon Urbanek wrote:
> On Dec 15, 2012, at 7:38 PM, Norm Matloff wrote:
>> Even if one has the entire machine to oneself, there is often another
>> very good reason not to use the maximum number of cores: Using the
>> maximum number of cores may reduce performance. This is true in
>> general, and sometimes especially true when the inferred number of
>> cores includes hyperthreading.
>
> Actually, the converse is often true (it depends on the machine
> architecture, though - I'm assuming true SMP machines here) -- often it
> is beneficial to run more threads than cores because the time spent
> waiting for access outside the CPU can be used by another thread that
> can continue computing. This is in particular true for parallel because
> of the setup overhead -- typically the real problem is memory, though.
> That said, the balance is heavily machine and task dependent so any
> default will be bad for some cases. Typically, for commodity machines
> with couple dozen cores it's good to overload, for bigger machines it's
> bad.

Yes, it sometimes is beneficial to run more threads than cores. But "typically" is a rather risky term to use. As usual, this is very problem-dependent, and what is typical for one person may not be so for another. I would speculate, for instance, that most embarrassingly parallel applications can benefit from some degree of oversubscription, but even then I wouldn't go out on a limb.

At any rate, the main point for the OP is that there are performance reasons not to set the number of threads/processors equal to the number of cores.

Norm
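Since both messages agree that the right worker count is machine- and task-dependent, one practical reading is to treat `detectCores()` as an upper hint rather than a setting. The sketch below is an assumption-laden illustration: `pick_workers()` is a hypothetical helper, not part of 'parallel', and the `logical = FALSE` argument is honoured only on some platforms (elsewhere it may return the logical CPU count or `NA`).

```r
library(parallel)

# pick_workers() is a hypothetical helper, not part of 'parallel'.
pick_workers <- function(cap = NULL) {
  n <- detectCores(logical = FALSE)               # physical cores, where the platform reports them
  if (is.na(n)) n <- detectCores(logical = TRUE)  # fall back to logical CPUs
  if (is.na(n)) n <- 1L                           # last resort: run serially
  if (!is.null(cap)) n <- min(n, cap)             # never exceed a user-supplied cap
  max(1L, as.integer(n))
}

print(pick_workers())
print(pick_workers(cap = 2))
```

The only reliable way to choose between under- and oversubscription is to time the actual job with `system.time()` at a few worker counts.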
Re: [Rd] GPU Computing
Oops, sent to the wrong list (again), sorry.

Date: Tue, 21 Aug 2012 13:54:48 -0700
From: Norm Matloff matl...@cs.ucdavis.edu
To: r-sig-...@r-project.org
Subject: Re: [R-sig-hpc] GPU Computing

Peter Chausse wrote:

> I am looking for a function similar to mclapply() that would work with
> GPU cores. I have looked at all possible packages related to GPU...

The short answer is no. Functions like mclapply() work on, say, a quad-core machine by setting up new invocations of R to run on each of the four CPU cores. What you have in mind would mean having R run on each of the GPU cores. This is not possible, for a variety of reasons (R needs a terminal shell, it needs I/O etc.).

To have R take advantage of GPUs, one must write C/C++ (or FORTRAN) code. Currently packages that do this are very limited. See the relevant CRAN Task View, at

http://cran.r-project.org/web/views/HighPerformanceComputing.html

You might also take a look at my Rth package, at

http://heather.cs.ucdavis.edu/~matloff/rth.html

Norm Matloff
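For contrast with the GPU case, here is what the CPU-side call looks like. It is a minimal sketch with a toy workload; the worker count of 2 is an arbitrary choice. On Unix-alikes each worker is a full R process forked from the current session, which is what cannot be reproduced on GPU cores.

```r
library(parallel)

# Forking is unavailable on Windows, where mclapply() requires mc.cores = 1.
n_workers <- if (.Platform$OS.type == "windows") 1L else 2L

# Toy workload: each list element is handled by one of the worker processes.
squares <- mclapply(1:8, function(i) i^2, mc.cores = n_workers)
print(unlist(squares))
```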
Re: [Rd] Use GPU in R with .Call
[Sorry, originally sent to wrong list.]

I'm not exactly sure what you are asking, Raymond, but this may answer your question.

Say you have a file x.cu. After compiling with nvcc -c as you did, then do something like this:

    setenv PKG_LIBS "-L/usr/local/cuda/lib -lcudart"
    R CMD SHLIB x.o -o x.so

PKG_LIBS is an environment variable used by R CMD SHLIB. Of course, you need to translate the setting of the environment variable from Linux C shell to Windows, and substitute your location of the CUDA library.

Norm
Re: [Rd] bug (or feature) in alpha 2.13?
Thanks very much, Duncan.

Norm

On Sun, Mar 27, 2011 at 08:57:08AM -0400, Duncan Murdoch wrote:
> Fixed now. Because of the internal change to srcref records
>
>     \item \code{srcref} attributes now include two additional line
>     number values, recording the line numbers in the order they were
>     parsed.
>
> the code that saved the current location didn't recognize the record,
> and skipped saving it.
>
> Duncan Murdoch
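The record Duncan refers to can be inspected directly. The sketch below parses a small made-up function from a string (the function body is arbitrary) and looks at the integer values stored in its `srcref`; the last two entries are the "parsed" line numbers mentioned in the NEWS item.

```r
# A three-line function, parsed from text so the example is self-contained.
src <- "f <- function(v) {\n  v[[10]]\n}"
exprs <- parse(text = src, keep.source = TRUE)

# The srcref of the first (and only) top-level expression.
sr <- attr(exprs, "srcref")[[1]]
vals <- as.integer(sr)  # drop the class and srcfile attributes

print(length(vals))
print(vals)
```

Code that assumes an older, shorter record layout is the kind of thing that would miss the location, which matches the bug described above.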
[Rd] bug (or feature) in alpha 2.13?
The pattern (I can make a simple example if needed):

    > source("x.R")
    > options(error=recover)
    > x <- ...
    > f(x)  # f() from x.R
    (subscript bounds error, now in recover())
    Selection: 1
    Browse[1]> where

In the output from where, there should be information on the line number at which the user code blew up. It's there in 2.12, but not in 2.13, from what I can see.

Norm Matloff
Re: [Rd] GUI's and R background processes
(Sorry, originally sent to wrong list.)

Anne, you can accomplish your goal by using my Rdsm package, which adds a threads-like capability to R. You can download it from CRAN. Look in particular in the examples/ directory. The file WebProbe.R is pretty much exactly the same usage that you want. Look at Auction.R too.

You may also find my UseR! presentation on Rdsm to be helpful, user2010.org/slides/Matloff.pdf

You could do the same thing, though less directly and I believe less conveniently, using some of the packages Louis mentioned, as well as bigmemory.

Norm Matloff
Re: [Rd] full copy on assignment?
Thanks very much.

By the way, I tried setting a GDB breakpoint at duplicate1(), with the following:

    x <- 1:1000
    x[3] <- 8
    x[33] <- 88

I found that duplicate1() was called on both of the latter two lines. I was a bit surprised, since change-on-write would seem to imply that copying would be done in that second line but NOT on the third. Moreover, system.time() gave 0.284 user time for the second and 0 on the third. YET duplicate1() WAS called on the third, and in stepping through the code, there didn't seem to be an immediate exit.

Thanks to both John and Duncan for their comments on the fact that using "[<-" directly is a very different situation. That's not what I asked, but the comment is useful to me for other reasons.

Norm

Date: Sat, 03 Apr 2010 17:54:58 -0700
From: John Chambers j...@r-project.org
To: r-devel@r-project.org
Subject: Re: [Rd] full copy on assignment?

> ...
> ...
> How often does y get duplicated? Hopefully not a million times. One can
> look at this in gdb, by trapping calls to duplicate1. The answer is:
> just once, to ensure that the object is local. Then the duplicated
> version has only one reference and the primitive replacement doesn't
> copy it.
> ...
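John's description (one duplication to make the object local, then no further copies) can also be watched from within R, without gdb. This is a sketch under assumptions: `zero_out()` is a made-up function, `tracemem()` only works in builds of R with memory profiling enabled (hence the `capabilities()` guard), and how many copies are reported may differ across R versions.

```r
# A made-up function that replaces every element of its argument in a loop.
zero_out <- function(y) {
  for (i in seq_along(y)) y[i] <- 0
  y
}

x <- rep(1, 10)

# tracemem() reports each duplication of x; it needs memory profiling support.
can_trace <- capabilities("profmem")
if (can_trace) invisible(tracemem(x))

res <- zero_out(x)  # any duplication of x is reported while this runs

if (can_trace) untracemem(x)

print(res)
print(x)  # the caller's vector is untouched
```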
[Rd] full copy on assignment?
Here's a basic question that doesn't seem to be completely answered in the docs, and which unfortunately I've not had time to figure out by wading through the R source code:

In a vector (or array) element assignment such as

    z[3] <- 8

is there in actuality a full rewriting of the entire vector pointed to by z, as implied by

    z <- "[<-"(z,3,value=8)

?

Assume that an element of z has already been changed previously, so that copy-on-change issues don't apply, with z being reassigned back to the same memory address.

I seem to recall reading somewhere that recent R versions make some attempt to avoid rewriting the entire vector, and my timing experiments seem to suggest that it's true.

So, is a full rewrite avoided? And where in the source code is this done?

Thanks.

Norm Matloff
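The two forms in the question give the same value, which the sketch below checks; it does not show whether a copy was made. It also illustrates one general R behaviour that can force a full rewrite regardless of sharing: assigning a double into an integer vector changes the vector's type. That second part is an added observation, not something established in this thread.

```r
z <- c(5, 12, 13)

# The usual replacement syntax, applied to a separate name.
z1 <- z
z1[3] <- 8

# The explicit function-call form from the question.
z2 <- `[<-`(z, 3, value = 8)

print(identical(z1, z2))

# Type coercion: 1:5 is integer, 8 is double, so the result must be a new double vector.
zi <- 1:5
type_before <- typeof(zi)
zi[3] <- 8
type_after <- typeof(zi)

# Using an integer literal keeps the type unchanged.
zj <- 1:5
zj[3] <- 8L

print(c(type_before, type_after, typeof(zj)))
```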
Re: [Rd] full copy on assignment?
Thanks, Martin and Duncan, for the quick, clear replies.

Norm