[Rd] Conflicts between 'parallel' and 'Rprof', and between two parallel R sessions

2012-01-26 Thread Zepu Zhang
Dear list,

I observed two problems that I suppose are generic.

First, using 'Rprof' to profile parallel code (based on the package
'parallel') caused

 Error in unserialize(node$con) : error reading from connection

Second, on a multicore desktop, I concurrently opened two terminals
and ran two separate R sessions, both running (actually
identical) parallel code (which sets up a cluster with as many nodes
as there are cores). I got

 ...Error in socketConnection("localhost", port = port, server = TRUE, blocking = TRUE,  :
   cannot open the connection
 In addition: Warning message:
 In socketConnection("localhost", port = port, server = TRUE, blocking = TRUE,  :
   port 10187 cannot be opened

Are there ways to do the two things above without problems?

Thanks!

Zepu
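
For the second problem, one possible workaround is to give each session
its own port, since both sessions above tried to listen on port 10187.
The following is only a minimal sketch: it assumes a version of 'parallel'
in which makeCluster() accepts a 'port' option for socket clusters, and
the helper pick_port() and its port range are hypothetical, not part of
the package. Deriving the port from the process ID makes a clash between
two sessions unlikely, but does not rule it out.

```r
library(parallel)

# Hypothetical helper (not part of 'parallel'): derive a port from the
# process ID so that two concurrent sessions rarely choose the same one.
pick_port <- function(base = 11000L, span = 1000L) {
  base + (Sys.getpid() %% span)
}

port <- pick_port()

# Assumption: makeCluster() forwards 'port' to the socket cluster setup.
cl <- makeCluster(2L, type = "PSOCK", port = port)
res <- parLapply(cl, 1:4, function(i) i^2)
stopCluster(cl)
```

Setting the environment variable R_PARALLEL_PORT to a different value in
each terminal before starting R is another way to get the same effect.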

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] avoid copying big object passed into optimize()

2011-03-17 Thread Zepu Zhang
Thanks! I found that, say, exp(x) causes 2 duplications whereas sum(x)
causes none. Is there any documentation I can learn from about this?

(First-time list user; sorry if anything about the posting procedure is
wrong.)

On Wed, Mar 9, 2011 at 6:16 PM, Matt Shotwell wrote:

> On Wed, 2011-03-09 at 17:15 -0900, Zepu Zhang wrote:
> > Hello list,
> >
> > I have the following scenario:
> >
> > f1 <- function(a)
> > {
> >   # doing things; may need 'a', but does not change 'a'.
> >
> >   g <- function(x)
> >   {
> >     sum(x + a)  # Say. Use 'a'; does not change 'a'.
>
> The expression 'x + a' causes 'a' to be duplicated; 'x' is added to each
> element of the duplicated vector, then returned. The sum occurs
> afterward. To avoid this use an expression like: 'length(a) * x +
> sum(a)'. Also, please see this recent thread regarding the
> pass-by-value / pass-by-reference issue:
> http://tolstoy.newcastle.edu.au/R/e13/help/11/03/6632.html
>
> >   }
> >
> >   optimize(f = g, lower = 0, upper = 1)
> > }
> >
> >
> > f2 <- function()
> > {
> >   b <- runif(1000)   # Create big object.
> >
> >   f1(a = b)
> > }
> >
> >
> > My main concern is to reduce copying of the big object 'a'. Questions:
> >
> > (1) In f1, 'a' never appears on the LHS of assignment. Is it passed by
> > value or by reference? Say the situation is simpler and more general: no
> > optimization call in f1.
>
> 'a' is passed by value, but not necessarily copied in memory.
>
> > (2) Is there any difference, as far as copying of the big 'a' is
> > concerned, if 'g' is changed to
> >    g <- function(x, b)  { sum(x + b) }
> > and called by
> >    optimize(f = g, lower = 0, upper = 1, b = a)
>
> No.
>
> > (3) Is 'a' passed into the C optimization function one-off, or again and
> > again across the C-R interface?
>
> I don't think either is completely correct. But more to your point, 'a'
> is not necessarily copied repeatedly. If you make the substitution I
> suggested above for 'g', then 'a' is not repeatedly copied.
>
> > (4) Does it help if I remove the argument 'a' of 'f1', and let 'g' look
> > for it (of course it should be referred to as 'b' now) directly in the
> > environment of 'f2'?
>
> No. 'g' would then search and find 'a' farther down the environment
> tree.
>
> > (5) Any suggestions?
>
> Avoid operations that necessitate a copy. Compile R with
> --enable-memory-profiling and use the tracemem function to help in this.
>
> > Many thanks for your help!
> >
> > Zepu
>
>
>
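
The substitution suggested in the reply above can be checked with a short
sketch. It assumes 'x' is a scalar, as it is when optimize() calls the
objective; the names g_alloc and g_noalloc are hypothetical. tracemem()
is only available when R was built with memory profiling, so the code
guards for that rather than assuming it.

```r
set.seed(1)
a <- runif(1000)  # stands in for the big object

# Original objective: 'x + a' allocates a temporary vector of length(a)
# on every call.
g_alloc <- function(x) sum(x + a)

# Suggested rewrite: for scalar x, sum(x + a) equals length(a) * x + sum(a),
# so no vector of length(a) has to be created.
g_noalloc <- function(x) length(a) * x + sum(a)

# The two objectives agree up to floating-point rounding.
x0 <- 0.3
same <- isTRUE(all.equal(g_alloc(x0), g_noalloc(x0)))

# tracemem() reports duplications of 'a' if memory profiling is available.
if (capabilities("profmem")) {
  tracemem(a)
  res <- optimize(f = g_noalloc, lower = 0, upper = 1)
  untracemem(a)
} else {
  res <- optimize(f = g_noalloc, lower = 0, upper = 1)
}
```

If sum(a) itself is expensive, it could also be computed once outside the
objective and reused, since 'a' does not change during the optimization.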



[Rd] avoid copying big object passed into optimize()

2011-03-09 Thread Zepu Zhang
Hello list,

I have the following scenario:

f1 <- function(a)
{
  # doing things; may need 'a', but does not change 'a'.

  g <- function(x)
  {
    sum(x + a)  # Say. Use 'a'; does not change 'a'.
  }

  optimize(f = g, lower = 0, upper = 1)
}


f2 <- function()
{
  b <- runif(1000)   # Create big object.

  f1(a = b)
}


My main concern is to reduce copying of the big object 'a'. Questions:

(1) In f1, 'a' never appears on the LHS of assignment. Is it passed by value
or by reference? Say the situation is simpler and more general: no
optimization call in f1.

(2) Is there any difference, as far as copying of the big 'a' is concerned,
if 'g' is changed to
   g <- function(x, b)  { sum(x + b) }
and called by
   optimize(f = g, lower = 0, upper = 1, b = a)

(3) Is 'a' passed into the C optimization function one-off, or again and
again across the C-R interface?

(4) Does it help if I remove the argument 'a' of 'f1', and let 'g' look for
it (of course it should be referred to as 'b' now) directly in the
environment of 'f2'?

(5) Any suggestions?

Many thanks for your help!

Zepu
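
The two forms in question (2) can be written out side by side as a
runnable sketch. The function names f1_closure and f1_dots are
hypothetical, introduced only for illustration. The sketch shows that
both forms reach the same optimum; it does not measure copying.

```r
set.seed(42)
b <- runif(1000)  # stands in for the big object

# Form A: 'g' is a closure and finds 'a' in the environment of f1.
f1_closure <- function(a) {
  g <- function(x) sum(x + a)
  optimize(f = g, lower = 0, upper = 1)
}

# Form B: the big object is forwarded to 'g' through optimize()'s '...'.
f1_dots <- function(a) {
  g <- function(x, b) sum(x + b)
  optimize(f = g, lower = 0, upper = 1, b = a)
}

res_closure <- f1_closure(b)
res_dots <- f1_dots(b)
```

Both calls return a list with components 'minimum' and 'objective', and
the two lists should agree because the objective values are the same.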



[Rd] A Call for a Smaller R Core Package

2006-09-20 Thread Zepu Zhang
(Below is my idea on an issue that has troubled me for a fairly long time. I
hope it's not viewed as trouble making.)

A Call for a Smaller R Core Package

This document suggests downsizing the 'core' package of R
by moving some specialized functionality into separate packages.
I'll use string-related functions as examples,
because I happened to be troubled by them today.

1. The core is too big

R is a function-rich environment.
However, non-central functions are better organized in specialized packages.
From time to time I have felt the need to go through the core package for a
complete picture of what is at my disposal,
yet so far I haven't done that.
In the 'R Reference Manual' the core package runs for over 400 pages
with about 400 entries, and mysteriously some functions don't show up
in the TOC, e.g. 'sub'.
In the two-volume reference set printed by Network-Theory,
the core is the entire first book.
In contrast, the 'Intrinsic Functions' chapter of the classic Fortran
reference "Fortran 95/2003 Explained" runs for maybe 30(?) pages.
I flipped through it many times and I can say with confidence,
"OK these are ALL the Fortran intrinsics and I know what they do."
For R, I found it an intimidating task to flip through the 400+ page core
and retain a clear mind at the end.

Below is a random sample of string-related functions in the core package:

agrep
basename
charmatch
chartr
gregexpr
grep
gsub
regex
regexpr
strsplit
strtrim
strwrap
sub

In my opinion, anything that uses regular expressions belongs somewhere else.
Even 'utils' seems to be a better place for random items than the 'core'.

2. Benefits of a smaller core

a) A smaller core will be more carefully studied and better appreciated.

If the R core functions were documented in 100 pages,
I would be a much better R programmer than I am today
because I would have singled out and studied the more fundamental routines
about function calls, etc.

The criteria for a function to be in the core seem to be: 1) fundamental; or
2) very often used.

A smaller core is more stable.

b) A specialized 'string' package makes string-related functions much easier
to find.

It could be that I still need all the functions.
But since they are grouped together, it greatly helps learning.
I would very rarely reinvent the wheel, because
I could quickly get a sweeping view of the dedicated package.

c) It will be easier to enrich string-related functionality without
cluttering the core.

3. Costs of such re-arrangements

a) To the R development team

(I don't really know.)

Utility functions that are frequently used by basic functions
may well stay in the core.
Those that are not may not be too difficult to move around.
The spin-off package may always be loaded automatically as a basic one,
but as discussed above, a clean grouping greatly helps learning
and finding things.

b) To R users

The system (both the core and the specialized package) will be easier to learn
and use.


-- Zepu Zhang, [EMAIL PROTECTED]
