How can I copy distinct blocks of data to each process?

On Mon, Oct 14, 2013 at 10:21 PM, Jeff Newmiller
<jdnew...@dcn.davis.ca.us> wrote:
> The session info is helpful. To the best of my knowledge there is no easy way
> to share memory between R processes other than forking. You can use
> clusterExport to make "global" copies of large data structures in each
> process and pass index values to your function, which reduces copying costs
> at the price of holding extra data in each process that may never be used.
> Or you can copy distinct blocks of data to each process and use
> single-threaded processing to loop over the blocks within the workers, which
> reduces the number of calls to the workers. I don't claim to be an expert
> with the parallel package, so others may have better advice. However, with
> two cores I don't usually get better than a 30% speedup... the best payoff
> comes with four or more workers.
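
If I understand the second option, the "distinct blocks" idea would look
roughly like the untested sketch below (cl, myfun, and bigmat are
placeholders for my actual cluster, per-row function, and data):

library(parallel)

## Untested sketch: myfun and bigmat stand in for the real per-row
## function and data.
cl     <- makeCluster(2)
myfun  <- function(row) sum(row)
bigmat <- matrix(rnorm(1000 * 10), nrow = 1000)

## Split the row indices into one block per worker, then extract each
## block so only that piece of the matrix is shipped to its worker.
blocks <- lapply(clusterSplit(cl, seq_len(nrow(bigmat))),
                 function(idx) bigmat[idx, , drop = FALSE])

## One call per worker; each worker loops over the rows of its own block.
res <- unlist(clusterApply(cl, blocks,
                           function(block, f) apply(block, 1, f),
                           f = myfun))

stopCluster(cl)

That way each worker receives only its own piece of the data, and there is
one call per worker instead of one per row.
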
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                       Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> Jeffrey Flint <jeffrey.fl...@gmail.com> wrote:
>>Jeff:
>>
>>Thank you for your response.  Please let me know how I can
>>"unhandicap" my question.  I tried my best to be concise.  Maybe this
>>will help:
>>
>>> version
>>               _
>>platform       i386-w64-mingw32
>>arch           i386
>>os             mingw32
>>system         i386, mingw32
>>status
>>major          3
>>minor          0.2
>>year           2013
>>month          09
>>day            25
>>svn rev        63987
>>language       R
>>version.string R version 3.0.2 (2013-09-25)
>>nickname       Frisbee Sailing
>>
>>
>>I understand your comment about forking.  You are right that forking
>>is not available on Windows.
>>
>>What I am curious about is whether I can direct the execution of the
>>parallel package's functions to diminish the overhead.  My guess is
>>that there is overhead both in copying the function to be executed and
>>in copying the data it uses at each iteration.  Are there any paradigms
>>in the parallel package to reduce these overheads?  For instance, I
>>could use clusterExport to establish the function to be called, but I
>>don't know whether there is a technique whereby I could point each
>>worker at the data it should use so as to prevent a copy.
>>
>>Jeff
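
To make the clusterExport idea concrete, this is the kind of thing I had in
mind (untested sketch; bigmat and myfun are placeholders for my actual data
and function):

library(parallel)

cl     <- makeCluster(2)
bigmat <- matrix(rnorm(1000 * 10), nrow = 1000)   # placeholder data
myfun  <- function(row) sum(row)                  # placeholder function

## Export the function and the large data to every worker once...
clusterExport(cl, c("bigmat", "myfun"))

## ...then pass only small index vectors on each call, so the big objects
## are not re-serialized per iteration; the workers look bigmat and myfun
## up in their own global environments.
idx <- clusterSplit(cl, seq_len(nrow(bigmat)))
res <- unlist(clusterApply(cl, idx,
                           function(i) apply(bigmat[i, , drop = FALSE], 1, myfun)))

stopCluster(cl)

The drawback mentioned above still applies: every worker holds a full copy
of bigmat whether it ends up using all of it or not.
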
>>
>>
>>
>>On Mon, Oct 14, 2013 at 2:35 PM, Jeff Newmiller
>><jdnew...@dcn.davis.ca.us> wrote:
>>> Your question misses several points in the Posting Guide, so any
>>> answers are handicapped by those omissions.
>>>
>>> There is an overhead in using parallel processing, and the value of
>>> two cores is marginal at best. In general, parallel by forking is
>>> more efficient than parallel by SNOW, but the former is not available
>>> on all operating systems. This is discussed in the vignette for the
>>> parallel package.
>>>
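
For my own notes, the two flavours being contrasted look roughly like this
toy example (the mclapply call is commented out because forking is not
available on Windows):

library(parallel)

x <- as.list(1:100)          # toy workload
f <- function(i) sqrt(i)

## Fork-based: the workers share memory with the parent, but this is not
## available on Windows.
## res <- mclapply(x, f, mc.cores = 2)

## SNOW-style (PSOCK) cluster: works on Windows, but functions and data
## have to be shipped to the workers over sockets.
cl  <- makeCluster(2)
res <- parLapply(cl, x, f)
stopCluster(cl)
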
>>>
>>> Jeffrey Flint <jeffrey.fl...@gmail.com> wrote:
>>>>I'm running package parallel in R-3.0.2.
>>>>
>>>>Below are the execution times using system.time for when executing
>>>>serially versus in parallel (with 2 cores) using parRapply.
>>>>
>>>>
>>>>Serially:
>>>>   user  system elapsed
>>>>   4.67    0.03    4.71
>>>>
>>>>
>>>>
>>>>Using package parallel:
>>>>   user  system elapsed
>>>>   3.82    0.12    6.50
>>>>
>>>>
>>>>
>>>>There is an evident improvement in the user CPU time, but a big jump
>>>>in the elapsed time.
>>>>
>>>>In my code, I am executing a function on a 1000-row matrix 100 times,
>>>>with the data different each time, of course.
>>>>
>>>>The initial call to makeCluster cost 1.25 seconds of elapsed time.
>>>>I'm not concerned about the makeCluster time since that is a fixed
>>>>cost.  I am concerned about the additional 1.43 seconds of elapsed
>>>>time left over after the compute and the cluster setup (6.50 = 3.82 +
>>>>1.25 + 1.43).
>>>>
>>>>I am wondering if there is a way to structure the code to largely
>>>>avoid the 1.43-second overhead.  For instance, perhaps I could
>>>>upload the function to both cores manually in order to avoid the
>>>>function being uploaded at each of the 100 iterations?  Also, I am
>>>>wondering if there is a way to avoid any copying that is occurring at
>>>>each of the 100 iterations?
>>>>
>>>>
>>>>Thank you.
>>>>
>>>>Jeff Flint
>>>>
>>>
>
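
Coming back to my timing question above: if I follow the advice, the way to
largely avoid that extra 1.43 seconds would be to make one call per worker
instead of 100 separate parRapply calls, along these lines (untested sketch;
buildData and myfun are placeholders for however each iteration's 1000-row
matrix is produced and processed):

library(parallel)

## Placeholders for the real code.
buildData <- function(i) matrix(rnorm(1000 * 10), nrow = 1000)
myfun     <- function(row) sum(row)

cl <- makeCluster(2)
clusterExport(cl, c("buildData", "myfun"))   # ship the functions once

## Give each worker its share of the 100 iterations in a single call and
## loop inside the worker, instead of paying the dispatch cost 100 times.
chunks <- clusterSplit(cl, 1:100)
res <- clusterApply(cl, chunks, function(iters)
    lapply(iters, function(i) apply(buildData(i), 1, myfun)))
res <- do.call(c, res)   # flatten: one result per iteration

stopCluster(cl)

If the per-iteration data really has to originate on the master, it would
instead need to be shipped to the workers in blocks, as in the first sketch
above.
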

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
