Re: [R] project parallel help

2013-10-15 Thread Jeff Newmiller
The session info is helpful. To the best of my knowledge there is no easy
way to share memory between R processes other than forking. You can use
clusterExport to make global copies of large data structures in each process
and pass index values to your function, which reduces copy costs at the
price of extra copies of data in each process that won't all be used. Or you
can copy distinct blocks of data to each process and loop over the blocks
single-threaded within the workers, which reduces the number of calls to the
workers. I don't claim to be an expert with the parallel package, though, so
others may have better advice. In any case, with two cores I don't usually
get better than a 30% speedup... the best payoff comes with four or more
workers.
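For illustration, a minimal sketch of the clusterExport-plus-indices
approach described above; the matrix, its size, and the worker function are
made-up placeholders, not code from this thread.

library(parallel)

cl <- makeCluster(2)

# Hypothetical large matrix, exported once so each worker holds its own
# global copy and it is not re-sent on every call.
bigmat <- matrix(rnorm(1000 * 100), nrow = 1000)
clusterExport(cl, "bigmat")

# The worker function receives only row indices; it finds bigmat in the
# worker's global environment.
row_block_sums <- function(idx) rowSums(bigmat[idx, , drop = FALSE])

# clusterSplit() divides the indices into one block per worker.
res <- parLapply(cl, clusterSplit(cl, 1:1000), row_block_sums)

stopCluster(cl)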

Re: [R] project parallel help

2013-10-15 Thread Jeffrey Flint
How can I copy distinct blocks of data to each process?

Re: [R] project parallel help

2013-10-15 Thread Jeff Newmiller
As parameters. For example, if you have 100 simulations, set up a list of 4
distinct sets of data (1:25, 26:50, etc.) and call the single-threaded
processing function from parLapply, so that it is invoked four times, once
per block. Then each instance of the processing function won't return until
it has completed its 25 simulations.
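A minimal sketch of that blocking scheme, assuming 100 simulations and 4
workers; run_block and the simulated data inside it are placeholders for
whatever the real per-simulation work is.

library(parallel)

cl <- makeCluster(4)

# Hypothetical single-threaded block runner: given a vector of simulation
# ids, loop over them sequentially and return one result per id.
run_block <- function(sim_ids) {
  lapply(sim_ids, function(i) {
    x <- matrix(rnorm(1000 * 10), nrow = 1000)  # stand-in for that simulation's data
    sum(x)                                      # stand-in for the real computation
  })
}

# Four blocks of 25 simulation ids each; a single parLapply call hands one
# block to each worker, so the workers are contacted 4 times instead of 100.
blocks <- split(1:100, rep(1:4, each = 25))
res <- parLapply(cl, blocks, run_block)

stopCluster(cl)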

Re: [R] project parallel help

2013-10-14 Thread Jeff Newmiller
Your question misses several points in the Posting Guide, so any answers
you get are handicapped by that.

There is overhead in using parallel processing, and the value of two cores
is marginal at best. In general, parallel by forking is more efficient than
parallel by SNOW, but the former is not available on all operating systems.
This is discussed in the vignette for the parallel package.
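For illustration only, the two flavours side by side; f is a throwaway
function, and the mclapply line assumes a Unix-alike (on Windows it would
have to run with mc.cores = 1).

library(parallel)

f <- function(i) sum(rnorm(1e5))

# Fork-based parallelism (mclapply): workers start as copies of the parent
# process, so there is little setup or data-shipping cost.
res_fork <- mclapply(1:8, f, mc.cores = 2)

# SNOW-style PSOCK cluster (parLapply): works on every OS, including
# Windows, but functions and data must be shipped to the workers.
cl <- makeCluster(2)
res_snow <- parLapply(cl, 1:8, f)
stopCluster(cl)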

Jeffrey Flint jeffrey.fl...@gmail.com wrote:
I'm running package parallel in R-3.0.2.

Below are the execution times from system.time when executing serially
versus in parallel (with 2 cores) using parRapply.

Serially:
   user  system elapsed
   4.67    0.03    4.71

Using package parallel:
   user  system elapsed
   3.82    0.12    6.50



There is an evident improvement in the user CPU time, but a big jump in
the elapsed time.

In my code, I am executing a function on a 1000-row matrix 100 times,
with different data each time, of course.

The initial call to makeCluster cost 1.25 seconds in elapsed time. I'm
not concerned about the makeCluster time since that is a fixed cost. I am
concerned about the additional 1.43 seconds in elapsed time
(6.50 = 3.82 + 1.25 + 1.43).

I am wondering if there is a way to structure the code to largely avoid
the 1.43-second overhead. For instance, perhaps I could upload the function
to both cores manually, to avoid the function being uploaded at each of the
100 iterations? Also, I am wondering if there is a way to avoid any copying
that is occurring at each of the 100 iterations?


Thank you.

Jeff Flint
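For context, a self-contained sketch of the kind of timing comparison
described above; the matrix contents, its width, and the per-row function f
are placeholders, not the poster's actual code.

library(parallel)

m <- matrix(rnorm(1000 * 50), nrow = 1000)  # stand-in for one 1000-row input
f <- function(row) sum(row^2)               # stand-in for the real per-row work

# Serial baseline: apply f over the rows 100 times.
system.time(for (k in 1:100) r1 <- apply(m, 1, f))

# Two-worker PSOCK cluster: the same work via parRapply. Each of the 100
# calls splits m and ships the pieces (and f) to the workers, which is
# where extra elapsed time can come from.
cl <- makeCluster(2)
system.time(for (k in 1:100) r2 <- parRapply(cl, m, f))
stopCluster(cl)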


Re: [R] project parallel help

2013-10-14 Thread Jeffrey Flint
Jeff:

Thank you for your response.  Please let me know how I can
unhandicap my question.  I tried my best to be concise.  Maybe this
will help:

> version
               _
platform       i386-w64-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          3
minor          0.2
year           2013
month          09
day            25
svn rev        63987
language       R
version.string R version 3.0.2 (2013-09-25)
nickname       Frisbee Sailing


I understand your comment about forking. You are right that forking is not
available on Windows.

What I am curious about is whether I can direct the execution of the
parallel package's functions to diminish the overhead. My guess is that
there is overhead in copying the function to be executed at each iteration,
and overhead in copying the data to be used at each iteration. Are there
any paradigms in the parallel package to reduce these overheads? For
instance, I could use clusterExport to establish the function to be called,
but I don't know if there is a technique whereby I could point to the data
to be used by each CPU so as to prevent a copy.

Jeff
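A hedged sketch of one way to do roughly what is asked above: export the
function once with clusterExport so only a thin wrapper and the fresh data
travel on each of the 100 calls. row_stat and the matrix are made-up
placeholders, and the per-call data still has to be copied because it
changes every iteration.

library(parallel)

cl <- makeCluster(2)

# Hypothetical per-row statistic, exported to the workers a single time up
# front instead of being re-established on every iteration.
row_stat <- function(row) sum(row^2)
clusterExport(cl, "row_stat")

for (k in 1:100) {
  m <- matrix(rnorm(1000 * 20), nrow = 1000)  # new data each iteration
  # Only a tiny anonymous wrapper and m are sent; the wrapper looks
  # row_stat up in the worker's global environment.
  res <- parRapply(cl, m, function(row) row_stat(row))
}

stopCluster(cl)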


