Re: [Rd] Background session with R

2017-07-11 Thread Norm Matloff
My Rdsm package will do what you want:

https://cran.r-project.org/web/packages/Rdsm/index.html

Norm Matloff

> Message: 4
> Date: Mon, 10 Jul 2017 17:12:57 +
> From: "Stravs, Michael" <michael.str...@eawag.ch>
> To: "r-devel@r-project.org" <r-devel@r-project.org>
> Cc: "shiny-disc...@googlegroups.com" <shiny-disc...@googlegroups.com>
> Subject: [Rd] Background session with R
> Message-ID:
>   <9dd73f68ac266d4aa329e07b678177b191e37...@ee-mbx3.ee.emp-eaw.ch>
> Content-Type: text/plain; charset="UTF-8"
> 
> Hi,
> 
> I am working on some code to have a background R process running that I can 
> submit data to, check computation progress, and retrieve results later. I am 
> aware that "parallel" does a lot of that - however, "parallel" shuts down the 
> nodes when I quit the master process. On the contrary, I would want these 
> nodes to continue running, so I can fire up R again later and reconnect to 
> the nodes to retrieve the results.
> 
> The use case is Shiny apps, where I want a thin frontend as a GUI, workflow 
> launcher and result viewer, and launch background computation that isn't 
> dependent on the Shiny script staying alive.
> 
> Has this been done already, and/or are there simple modifications of 
> parallel/snow/etc that allow this? My current WIP thing uses Rserve.
> 
> (shiny-discuss cc'd).
> 
> Michael Stravs
> Eawag
> Umweltchemie
> BU E 23
> Überlandstrasse 133
> 8600 Dübendorf
> +41 58 765 6742

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] reference class internals

2014-01-09 Thread Norm Matloff
I have a question about reference classes, which someone here
undoubtedly can answer immediately, saving me hours of wading through
indecipherable internal code. :-)  Thanks in advance.  

Reference class data is mutable, fine, but in what sense?  Is it really
physical,  or is it just a view given to the programmer?
 
If for instance I have a vector as a field in a reference class, and I
change one element of the vector, is it really true that the change is
guaranteed to be made in-place, no copying, no memory reallocation etc?
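To make the question concrete, here is a minimal sketch of a reference class with a vector field. The class name Holder and the method set_elt are hypothetical, chosen only for illustration. The sketch shows the visible semantics (a change made through one name is seen through another, and copy() breaks that link); it does not by itself establish whether R copies or reallocates the vector internally.

```r
library(methods)

# Hypothetical class for illustration: a single numeric vector field.
Holder <- setRefClass("Holder",
  fields = list(v = "numeric"),
  methods = list(
    set_elt = function(i, value) {
      v[i] <<- value          # change one element of the field
      invisible(.self)
    }
  )
)

a <- Holder$new(v = c(1, 2, 3))
b <- a                        # b is a second name for the same object

bump <- function(obj) obj$set_elt(1, 100)
bump(a)                       # modify the object from inside a function

b_sees_change <- b$v[1] == 100    # the change is visible through b

d <- a$copy()                 # an explicit, independent copy
a$set_elt(2, -1)
copy_unaffected <- d$v[2] == 2    # the copy keeps the old value
```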

Norm



Re: [Rd] reference class internals

2014-01-09 Thread Norm Matloff

Bottom line:  Really no different from the case of ordinary vectors that
are not in reference classes, right?  In other words, not true
pass-by-reference.

Norm

On Thu, Jan 09, 2014 at 04:43:44PM -0600, Hadley Wickham wrote:
 It's a bit of a simplification, reference classes are wrappers around
 environments.  So if modifying a value in an environment would create
 a copy, then modifying the same value in a reference class will also
 create a copy.
 
 The situation with modifying a vector is a bit complicated as it will
 sometimes be modified in place and sometimes be duplicated and
 modified (depending on whether its NAMED attribute is 1 or 2, and
 exactly how you're modifying it).
 
 Hadley
 
 On Thu, Jan 9, 2014 at 4:33 PM, Norm Matloff matl...@cs.ucdavis.edu wrote:
  I have a question about reference classes, which someone here
  undoubtedly can answer immediately, saving me hours of wading through
  indecipherable internal code. :-)  Thanks in advance.
 
  Reference class data is mutable, fine, but in what sense?  Is it really
  physical,  or is it just a view given to the programmer?
 
  If for instance I have a vector as a field in a reference class, and I
  change one element of the vector, is it really true that the change is
  guaranteed to be made in-place, no copying, no memory reallocation etc?
 
  Norm
 
 
 
 
 -- 
 http://had.co.nz/



Re: [Rd] reference class internals

2014-01-09 Thread Norm Matloff

Thanks, Hadley and Simon.

The reason I asked today was that when reference classes first came out,
it had appeared to me that there is no performance advantage to using
reference classes, that it was mainly a style issue (encapsulation,
etc.).  Unless I'm missing something, both of you have confirmed my
original impression, correct?

Norm

On Thu, Jan 09, 2014 at 09:44:10PM -0500, Simon Urbanek wrote:
 On Jan 9, 2014, at 6:20 PM, Norm Matloff matl...@cs.ucdavis.edu wrote:
 
  Bottom line:  Really no different from the case of ordinary vectors that 
  are not in reference classes, right?  In other words, not true 
  pass-by-reference.
  
 
 The pass-by-reference applies to the object itself, not necessarily to 
 anything you obtain by calling a function on the object (like extracting a 
 part from it). Vectors are not reference-semantics objects so regular rules 
 apply.
 
 If you pass a reference semantics object to a function, the function can 
 modify the object. If you pass any other object, the contents are guaranteed 
 to not be touched. Reference-semantics objects in R are literally passed by 
 reference (same C pointer), so yes, it is true pass-by-reference.
 
 Cheers,
 Simon
 
 
 (*) - technically, there is a thin non-reference wrapper around the instances 
 of reference classes, because there are things you don't want to happen to 
 your ref-semantics instance - e.g. you don't want unclass(x) to destroy x and 
 all instances of it (which it would do if there was no wrapper). But the 
 actual payload of the object is a true ref-semantics object - an environment 
 - that is always passed by reference.
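The distinction described above can be sketched with a plain environment and a plain vector. The function names set_first_env and set_first_vec are made up for this example. An environment handed to a function is the same object the caller holds, while a vector modified inside a function leaves the caller's vector untouched.

```r
# Hypothetical helpers: each tries to overwrite the first element.
set_first_env <- function(e) { e$v[1] <- 100; invisible(NULL) }
set_first_vec <- function(v) { v[1] <- 100; invisible(NULL) }

e <- new.env()
e$v <- c(1, 2, 3)
v <- c(1, 2, 3)

set_first_env(e)   # the caller's environment is modified
set_first_vec(v)   # only the function's local value is modified

env_changed <- e$v[1] == 100
vec_unchanged <- v[1] == 1
```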
 
 
 




Re: [Rd] reference class internals

2014-01-09 Thread Norm Matloff

I guess I should explain where I'm coming from in all this.

I've always been something of a skeptic on object-oriented programming.
Though I agree it has some advantages, and I do use it myself (in
Python), in general I think it makes one work far too hard for the
potential benefit.  C++ templates (which I use in Thrust) drive me
crazy, very frustrating.

So I am, for better or worse, one of those people who don't even like S4
(again a style issue).  Obviously those who do like S4 may get a
performance benefit via reference classes in the situation Martin
mentions below.

I've been meaning for some time to look into whether there might
actually be a performance benefit for non-OOP programmers like me,
thinking the answer would be no but wanting to confirm.  So,
today I finally got around to asking, and immediately got three quick,
cogent and informative replies.  This testifies to the quality of the
membership of this list!  Thanks very much.

Norm

On Thu, Jan 09, 2014 at 08:27:09PM -0800, Martin Morgan wrote:
 On 01/09/2014 07:53 PM, Norm Matloff wrote:
 
 Thanks, Hadley and Simon.
 
 The reason I asked today was that when reference classes first came out,
 it had appeared to me that there is no performance advantage to using
 reference classes, that it was mainly a style issue (encapsulation,
 etc.).  Unless I'm missing something, both of you have confirmed my
 original impression, correct?
 
 We've used reference classes for performance benefit. E.g., updating
 a single (e.g., small) field in an S4 object triggers an entire copy
 of the object, whereas for a reference class the fields can be
 updated independently. This is especially true inside function
 (e.g., method) calls (e.g., slot access), where the object is marked
 to be duplicated.
 
 
  a = setClass("A", representation(x="numeric"))(x=1:5)
 .Internal(inspect(a))
 @5237508 25 S4SXP g0c0 [OBJ,NAM(2),S4,gp=0x10,ATT]
 ATTRIB:
   @5237460 02 LISTSXP g0c0 []
 TAG: @12ea3a0 01 SYMSXP g0c0 [NAM(2)] x
 @5225db8 13 INTSXP g0c3 [NAM(2)] (len=5, tl=0) 1,2,3,4,5
 TAG: @1284b08 01 SYMSXP g0c0 [LCK,gp=0x4000] class (has value)
 @52355c8 16 STRSXP g0c1 [NAM(2),ATT] (len=1, tl=0)
   @4740e48 09 CHARSXP g0c1 [gp=0x61] [ASCII] [cached] A
 ATTRIB:
   @52373f0 02 LISTSXP g0c0 []
   TAG: @128e500 01 SYMSXP g0c0 [NAM(2)] package
   @5235598 16 STRSXP g0c1 [NAM(2)] (len=1, tl=0)
 @12ee2b8 09 CHARSXP g0c2 [gp=0x61] [ASCII] [cached] .GlobalEnv
 a@x[1]=2L
 .Internal(inspect(a))  ## almost everything duplicated!
 @5243cd0 25 S4SXP g0c0 [OBJ,NAM(2),S4,gp=0x10,ATT]
 ATTRIB:
   @5243c60 02 LISTSXP g0c0 []
 TAG: @12ea3a0 01 SYMSXP g0c0 [NAM(2)] x
 @5225b30 13 INTSXP g0c3 [NAM(1)] (len=5, tl=0) 2,2,3,4,5
 TAG: @1284b08 01 SYMSXP g0c0 [LCK,gp=0x4000] class (has value)
 @52405f8 16 STRSXP g0c1 [NAM(2),ATT] (len=1, tl=0)
   @4740e48 09 CHARSXP g0c1 [gp=0x61] [ASCII] [cached] A
 ATTRIB:
   @5243bf0 02 LISTSXP g0c0 []
   TAG: @128e500 01 SYMSXP g0c0 [NAM(2)] package
   @52405c8 16 STRSXP g0c1 [NAM(2)] (len=1, tl=0)
 @12ee2b8 09 CHARSXP g0c2 [gp=0x61] [ASCII] [cached] .GlobalEnv
 
 (this also influences performance of other R objects, of course, e.g.,
 
  f = function(x) { x$a[1] = 2L; x }
  l = list(a=1:5); .Internal(inspect(l))
 @53f8448 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
   @53cef48 13 INTSXP g0c3 [] (len=5, tl=0) 1,2,3,4,5
 ATTRIB:
   @53f9190 02 LISTSXP g0c0 []
 TAG: @1284638 01 SYMSXP g0c0 [LCK,gp=0x4000] names (has value)
 @53f8418 16 STRSXP g0c1 [] (len=1, tl=0)
   @146b128 09 CHARSXP g0c1 [gp=0x61] [ASCII] [cached] a
  .Internal(inspect(f(l)))
 @53f83e8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
   @53cef00 13 INTSXP g0c3 [] (len=5, tl=0) 2,2,3,4,5
 ATTRIB:
   @53f9988 02 LISTSXP g0c0 []
 TAG: @1284638 01 SYMSXP g0c0 [LCK,gp=0x4000] names (has value)
 @53f83b8 16 STRSXP g0c1 [NAM(2)] (len=1, tl=0)
   @146b128 09 CHARSXP g0c1 [gp=0x61] [ASCII] [cached] a
 
 Copies are localized to the updated field with reference classes
 (can't show this with .Internal(inspect()), though, because x =
 new.env(); x$x = x; .Internal(inspect(x)) [mimicking .self in
 reference classes] has an infinite (? I didn't wait that long)
 recursion).
 
 I think actually reference classes have a surprising performance
 _hit_ compared to other R approaches to minimizing copying; this has
 come up on this or the R mailing list before, but I've lost track of
 the original. Here's a StackOverflow version
 
 http://stackoverflow.com/questions/18677696/stack-class-in-r-something-more-concise/18678440#18678440
 
 Martin
 
 

Re: [Rd] Regression stars

2013-02-09 Thread Norm Matloff
Thanks for bringing this up, Frank.

Since many of us are educators, I'd like to suggest a bolder approach.
Discontinue even offering the stars as an option.  Sadly, we can't stop
reporting p-values, as the world expects them, but does R need to cater
to that attitude by offering star display?  For that matter, why not
have R report confidence intervals as a default?

Many years ago, I wrote a short textbook on stat, and included a
substantial section on the dangers of significance testing.  All three
internal reviewers liked it, but the funny part is that all three said,
"I agree with this, but no one else will." :-)

Norm



Re: [Rd] Regression stars

2013-02-09 Thread Norm Matloff
I appreciate Tim's comments.

I myself have a social science paper coming out soon in which I felt
forced to use p-values, given their ubiquity.  However, I also told
readers of the paper that confidence intervals are much more informative
and I do provide them.  As I said earlier, there is no avoiding that,
and R needs to report p-values for that reason.  

Instead, the question is what to do about the stars; I proposed
eliminating them altogether.  Star-crazed users know how to determine
them themselves from the p-values, but deleting them from R would send a
message.

I did say my proposal was bold, which really meant I was suggesting
that R do SOMETHING to send that message, not necessarily star
elimination.

One such something would be the proposal I made, which would be to add
confidence intervals to the output.  This too could be just an option,
but again offering that option would send a message.  Indeed, I would
suggest that the help page explain that confidence intervals are more
informative.  (The help page could make a similar statement regarding
the stars.)
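As a sketch of what is already available without any change to R: the stars can be switched off with the existing show.signif.stars option, and confint() returns confidence intervals for a fitted model. The model below (mpg on wt from the built-in mtcars data) is only an illustrative choice.

```r
# Turn off significance stars in printed coefficient tables.
options(show.signif.stars = FALSE)

# Illustrative model on a built-in data set.
fit <- lm(mpg ~ wt, data = mtcars)
print(summary(fit))

# 95% confidence intervals for the coefficients.
ci <- confint(fit, level = 0.95)
print(ci)
```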

When I pitch R to people, I say that in addition to the large function
and library base and the nice graphics capabilities, R is above all
Statistically Correct--it's written by statisticians who know what they
are doing, rather than some programmer simply implementing a formula
from a textbook.  I know that a lot of people feel this is one of R's
biggest strengths.  Given that, one might argue that R should do what it
can to help users engage in good statistical practice.  I think this was
Frank's point.

Norm



Re: [Rd] SUGGESTION: Add get/setCores() to 'parallel' (and command line option --max-cores)

2012-12-15 Thread Norm Matloff
Henrik Bengtsson h...@biostat.ucsf.edu wrote:

^ In the 'parallel' package there is detectCores(), which tries its best
^ to infer the number of cores on the current machine.  This is useful
^ if you wish to utilize the *maximum* number of cores on the machine.
^ Several are using this to set the number of cores when parallelizing,
^ sometimes also hardcoded within 3rd-party scripts/package code, but
^ there are several settings where you wish to use fewer, e.g. in a
^ compute cluster where your R session is given only a portion of the
^ cores available.  Because of this, I'd like to propose to add
^ getCores(), which by default returns what detectCores() gives, but can

Even if one has the entire machine to oneself, there is often another
very good reason not to use the maximum number of cores:  Using the
maximum number of cores may reduce performance.  This is true in
general, and sometimes especially true when the inferred number of cores
includes hyperthreading.
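One way to act on this today, sketched under assumptions: ask detectCores() for physical cores (logical = FALSE), which excludes hyperthreads where the platform can report them, and cap the result. The helper name choose_workers and the default cap of 4 are hypothetical; detectCores() can return NA, which the sketch handles by falling back.

```r
library(parallel)

# Hypothetical helper: pick a worker count no larger than 'cap'.
choose_workers <- function(cap = 4L) {
  n <- detectCores(logical = FALSE)      # physical cores, if known
  if (is.na(n)) n <- detectCores(logical = TRUE)
  if (is.na(n)) n <- 1L                  # nothing could be detected
  max(1L, min(as.integer(cap), as.integer(n)))
}

workers <- choose_workers(cap = 2L)
print(workers)
```

The right cap is problem-dependent, so in practice one would time a representative task at a few different worker counts.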

Norm



Re: [Rd] SUGGESTION: Add get/setCores() to 'parallel' (and command line option --max-cores)

2012-12-15 Thread Norm Matloff
On Sat, Dec 15, 2012 at 10:58:34PM -0500, Simon Urbanek wrote:
 On Dec 15, 2012, at 7:38 PM, Norm Matloff wrote:
 
  Even if one has the entire machine to oneself, there is often
  another very good reason not to use the maximum number of cores:
  Using the maximum number of cores may reduce performance.  This is
  true in general, and sometimes especially true when the inferred
  number of cores includes hyperthreading.
 
 Actually, the converse is often true (it depends on the machine
 architecture, though - I'm assuming true SMP machines here) -- often
 it is beneficial to run more threads than cores because the time spent
 waiting for access outside the CPU can be used by another thread that
 can continue computing. This is in particular true for parallel
 because of the setup overhead -- typically the real problem is memory,
 though. That said, the balance is heavily machine and task dependent
 so any default will be bad for some cases. Typically, for commodity
 machines with couple dozen cores it's good to overload, for bigger
 machines it's bad.

Yes, it sometimes is beneficial to run more threads than cores.  But
"typically" is a rather risky term to use.  As usual, this is very
problem-dependent, and what is "typical" for one person may not be so
for another.  I would speculate, for instance, that most embarrassingly
parallel applications can benefit from some degree of oversubscription,
but even then I wouldn't go out on a limb.

At any rate, the main point for the OP is that there are performance
reasons not to set the number of threads/processors equal to the number
of cores.

Norm



Re: [Rd] GPU Computing

2012-08-21 Thread Norm Matloff
Oops, sent to the wrong list (again), sorry.

Date: Tue, 21 Aug 2012 13:54:48 -0700
From: Norm Matloff matl...@cs.ucdavis.edu
To: r-sig-...@r-project.org
Subject: Re: [R-sig-hpc] GPU Computing

Peter Chausse wrote:

 I am looking for a function similar to mclapply() that would work with
 GPU cores. I have looked at all possible packages related to GPU...

The short answer is no.

Functions like mclapply() work on, say a quad core machine, by setting
up new invocations of R to run on each of the four CPU cores.  What you
have in mind would mean having R run on each of the GPU cores.  This is
not possible, for a variety of reasons (R needs a terminal shell, it
needs I/O etc.).
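For reference, a minimal CPU-side use of mclapply() looks like the sketch below. On Unix-alikes the workers are forked copies of the running R process; on Windows mclapply() only supports mc.cores = 1, so the sketch falls back to that. The worker count of 2 is an arbitrary choice for illustration.

```r
library(parallel)

# mclapply() relies on forking, which is unavailable on Windows.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Each element is processed by an R worker process on a CPU core.
squares <- mclapply(1:4, function(i) i^2, mc.cores = cores)
result <- unlist(squares)
print(result)
```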

To have R take advantage of GPUs, one must write C/C++ (or FORTRAN)
code.  Currently packages that do this are very limited.  See the
relevant CRAN Task View, at

http://cran.r-project.org/web/views/HighPerformanceComputing.html

You might also take a look at my Rth package, at

http://heather.cs.ucdavis.edu/~matloff/rth.html 

Norm Matloff



Re: [Rd] Use GPU in R with .Call

2012-07-22 Thread Norm Matloff
[Sorry, originally sent to wrong list.]

I'm not exactly sure what you are asking, Raymond, but this may answer
your question.

Say you have a file x.cu.  After compiling with nvcc -c as you did, then
do something like this:

setenv PKG_LIBS "-L/usr/local/cuda/lib -lcudart"
R CMD SHLIB x.o -o x.so

PKG_LIBS is an environment variable used by R CMD SHLIB.

Of course, you need to translate the setting of the environment variable
from Linux C shell to Windows, and substitute your location of the CUDA
library.
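For readers on a Bourne-style shell (sh, bash) rather than C shell, the same setting might look like the sketch below. The CUDA path is the one assumed in the example above and will differ between installations; the link step is left as a comment because it needs the x.o produced earlier.

```shell
# Assumed CUDA library location; substitute your own.
export PKG_LIBS="-L/usr/local/cuda/lib -lcudart"

# Show the value that R CMD SHLIB would pick up from the environment.
echo "PKG_LIBS=$PKG_LIBS"

# With x.o already produced by nvcc -c, the link step would then be:
#   R CMD SHLIB x.o -o x.so
```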

Norm



Re: [Rd] bug (or feature) in alpha 2.13?

2011-03-27 Thread Norm Matloff

Thanks very much, Duncan.

Norm

On Sun, Mar 27, 2011 at 08:57:08AM -0400, Duncan Murdoch wrote:

 Fixed now.  Because of the internal change to srcref records

   \item \code{srcref} attributes now include two additional
   line number values, recording the line numbers in the order they
   were parsed.

 the code that saved the current location didn't recognize the record,  
 and skipped saving it.

 Duncan Murdoch



[Rd] bug (or feature) in alpha 2.13?

2011-03-26 Thread Norm Matloff


The pattern (I can make a simple example if needed):

source("x.R")
options(error=recover)
x <- ...
f(x)  # f() from x.R 
  (subscript bounds error, now in recover())
   Selection: 1
   Browse[1]> where

In the output from "where", there should be information on the line
number at which the user code blew up.  It's there in 2.12, but not in
2.13, from what I can see.
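A non-interactive way to check that line information is being attached to sourced code, as a sketch: source a small file with keep.source = TRUE and look at the function's srcref. The file contents and the function name f here are made up for the example; this checks only that the srcref exists, not what recover() prints.

```r
# Write a tiny script to a temporary file (hypothetical contents).
path <- tempfile(fileext = ".R")
writeLines(c("f <- function(x) {",
             "  x[[10]]",
             "}"), path)

# Source it into a private environment, keeping source references.
env <- new.env()
source(path, local = env, keep.source = TRUE)

sr <- attr(env$f, "srcref")                      # srcref record, if kept
first_line <- utils::getSrcLocation(env$f, "line")
print(sr)
print(first_line)
```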

Norm Matloff



Re: [Rd] GUI's and R background processes

2010-12-17 Thread Norm Matloff
(Sorry, originally sent to wrong list.)

Anne, you can accomplish your goal by using my Rdsm package, which adds
a threads-like capability to R.  You can download it from CRAN.  

Look in particular in the examples/ directory.  The file WebProbe.R is
pretty much exactly the same usage that you want.  Look at Auction.R
too. 

You may also find my UseR! presentation on Rdsm to be helpful,
user2010.org/slides/Matloff.pdf

You could do the same thing, though less directly and I believe less
conveniently, using some of the packages Louis mentioned, as well as
bigmemory.

Norm Matloff



Re: [Rd] full copy on assignment?

2010-04-04 Thread Norm Matloff
Thanks very much.

By the way, I tried setting a GDB breakpoint at duplicate1(), with the
following:

   x <- 1:1000
   x[3] <- 8
   x[33] <- 88

I found that duplicate1() was called on both of the latter two lines.
I was a bit surprised, since change-on-write would seem to imply that
copying would be done in that second line but NOT on the third.
Moreover, system.time() gave 0.284 user time for the second and 0 on
the third.  YET duplicate1() WAS called on the third, and in stepping
through the code, there didn't seem to be an immediate exit.
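One factor that may contribute to the cost of the first assignment, offered as an observation rather than a full explanation: 1:1000 is an integer vector, while 8 is a double, so the first assignment has to produce a double vector. The sketch below only checks the storage types; it says nothing about what duplicate1() does.

```r
x <- 1:1000
type_before <- typeof(x)        # storage type of 1:1000

x[3] <- 8                       # assigning a double value
type_after_first <- typeof(x)   # the vector has been coerced

x[33] <- 88                     # no further type change needed
type_after_second <- typeof(x)

print(c(type_before, type_after_first, type_after_second))
```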

Thanks to both John and Duncan for their comment on the fact that using
"[<-" directly is a very different situation.  That's not what I asked,
but the comment is useful to me for other reasons.

Norm

 Message: 4
 Date: Sat, 03 Apr 2010 17:54:58 -0700
 From: John Chambers j...@r-project.org
 To: r-devel@r-project.org
 Subject: Re: [Rd] full copy on assignment?
...
...
 How often does y get duplicated? Hopefully not a million times.  One can 
 look at this in gdb, by trapping calls to duplicate1.  The answer is:  
 just once, to ensure that the object is local.  Then the duplicated 
 version has only one reference and the primitive replacement doesn't 
 copy it.
...



[Rd] full copy on assignment?

2010-04-03 Thread Norm Matloff

Here's a basic question that doesn't seem to be completely answered in
the docs, and which unfortunately I've not had time to figure out by
wading through the R source code:

In a vector (or array) element assignment such as 

   z[3] <- 8 

is there in actuality a full rewriting of the entire vector pointed to
by z, as implied by

   z <- "[<-"(z,3,value=8)

Assume that an element of z has already been changed previously, so
that copy-on-change issues don't apply, with z being reassigned back to
the same memory address.
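The two forms above can be compared directly, as in this sketch. It checks only that the results are equal; whether z itself is copied, or altered, when the replacement function is called directly is exactly the implementation question being asked and is not settled by this example.

```r
z <- c(10, 20, 30, 40)

# Ordinary element assignment on a separate variable.
z_sugar <- z
z_sugar[3] <- 8

# The functional form, called explicitly and bound to a new name.
z_fun <- `[<-`(z, 3, value = 8)

same <- identical(z_sugar, z_fun)
print(same)
```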

I seem to recall reading somewhere that recent R versions make some
attempt to avoid rewriting the entire vector, and my timing experiments
seem to suggest that it's true.  

So, is a full rewrite avoided?  And where in the source code is this
done?

Thanks.

Norm Matloff



Re: [Rd] full copy on assignment?

2010-04-03 Thread Norm Matloff

Thanks, Martin and Duncan, for the quick, clear replies.

Norm
