Re: [Rcpp-devel] Problem with memory usage when using Rcpp(Parallel) in a package

2017-10-10 Thread Thanh Le Hoang
Hello, you can find a text copy of the previous emails below.
I have already found a solution for my problem, but thanks for your reply.

Thanh

> Sent: Tuesday, 10 October 2017 at 13:21
> From: "Dirk Eddelbuettel" <e...@debian.org>
> To: "Thanh Le Hoang" <thanh_le_ho...@web.de>
> Cc: Rcpp-devel@lists.r-forge.r-project.org
> Subject: Re: [Rcpp-devel] Problem with memory usage when using Rcpp(Parallel)
> in a package
>
> 
> On 10 October 2017 at 12:40, Thanh Le Hoang wrote:
> | [DELETED ATTACHMENT , HTML]
> 
> Can you please try again in text mode?
> 
> Dirk
> 
> -- 
> http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
> 


> Replying to my own email since I just found the solution. I somehow screwed
> up the Makevars/Makevars.win files, so I deleted them and created new files
> containing exactly the Makevars lines from the RcppParallel webpage. I also
> had to add
> #' @importFrom RcppParallel RcppParallelLibs
> to my package so that there were no errors with the NAMESPACE file when
> running roxygen2.

> > Sent: Monday, 09 October 2017 at 00:22
> > From: "Thanh Le Hoang" <thanh_le_ho...@web.de>
> > To: rcpp-devel@lists.r-forge.r-project.org
> > Subject: [Rcpp-devel] Problem with memory usage when using Rcpp(Parallel)
> > in a package
> > Hello,
> >  
> > I'm writing my first package for a machine learning algorithm called
> > self-organizing map where I use compiled code (with Rcpp) and
> > parallelization (RcppParallel).
> > My computer uses Windows 10 (64 bit, 8 GB RAM) and I currently have a
> > problem with the memory usage (shown in the Windows task manager) which
> > keeps going up the longer the algorithm runs. The usage doesn't increase
> > immediately, but only after a couple of seconds, and I only noticed it
> > when I tried larger data sets. The memory is only freed by
> > terminating/restarting the R session.
> >  
> > What is somewhat strange is that the memory usage is not attributed to
> > RStudio or the R session (i.e. the memory usage in the task manager does
> > not go up for the respective processes). According to RAMMap (which gives
> > more information about memory usage on Windows) the used memory belongs
> > to the "nonpaged pool". The RStudio profiler and lineprof did not seem to
> > detect the memory leak (if I read the output correctly). So far I have
> > rewritten parts of the C++ code to use references and pre-allocated
> > memory, but it did not help.
> >  
> > The main function in the package calls several smaller functions written
> > in C++ and it seems that all of those functions play a role here, but I
> > have found a function where this problem occurs consistently. It
> > calculates the (squared) Euclidean norm for each row of a given matrix
> > (in parallel) with a boolean vector (oldColumns) specifying which columns
> > should be used/ignored during this calculation:
> > 
> > https://pastebin.com/qgyzx0M7
> > 
> > When I pasted this code into a new project, I noticed that the problem
> > only happens when I build (with devtools::build()) and install a package
> > containing this function, regardless of whether I build a source package
> > or a binary package. When I just sourceCpp a file with this function, no
> > memory problems occur. So could this have anything to do with how I build
> > packages? Until now I have followed the "R packages" book written by
> > Hadley Wickham for this.
> > 
> > Here is some R code which generates some test data and calls the function.
> > 
> > https://pastebin.com/c0RaeW9K
> > 
> > Every time I run this code (which takes a couple of minutes), the memory
> > usage goes up by 4% to 6%, which makes my package unusable for larger
> > sets of data.
> > I have been stuck on this problem for a week now and any help would be
> > appreciated.
> > 
> > Thank you,
> > Thanh
___
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel


Re: [Rcpp-devel] Problem with memory usage when using Rcpp(Parallel) in a package

2017-10-10 Thread Dirk Eddelbuettel

On 10 October 2017 at 12:40, Thanh Le Hoang wrote:
| [DELETED ATTACHMENT , HTML]

Can you please try again in text mode?

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org


Re: [Rcpp-devel] Problem with memory usage when using Rcpp(Parallel) in a package

2017-10-10 Thread Thanh Le Hoang

Replying to my own email since I just found the solution. I somehow screwed up the Makevars/Makevars.win files, so I deleted them and created new files containing exactly the Makevars lines from the RcppParallel webpage. I also had to add

#' @importFrom RcppParallel RcppParallelLibs

to my package so that there were no errors with the NAMESPACE file when running roxygen2.
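
Note for readers with the same symptom: the Makevars lines on the RcppParallel webpage were, roughly, the following. This is reconstructed from the RcppParallel documentation, not taken from the poster's files, and the documentation may have changed since 2017, so treat it as a sketch and compare it against the current page before copying.

```make
# src/Makevars (Unix-alikes); reconstructed from the RcppParallel docs, verify before use
PKG_LIBS += $(shell ${R_HOME}/bin/Rscript -e "RcppParallel::RcppParallelLibs()")

# src/Makevars.win (Windows); this is a separate file, shown in the same block for brevity
PKG_CXXFLAGS += -DRCPP_PARALLEL_USE_TBB=1
PKG_LIBS += $(shell "${R_HOME}/bin${R_ARCH_BIN}/Rscript.exe" -e "RcppParallel::RcppParallelLibs()")
```

Both files call RcppParallel::RcppParallelLibs() at build time, which is why the function has to be importable from the package. The roxygen2 tag above ends up in NAMESPACE as importFrom(RcppParallel, RcppParallelLibs). Since $(shell ...) is a GNU make extension, the documentation also asks for "SystemRequirements: GNU make" in DESCRIPTION.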
 

 

Sent: Monday, 09 October 2017 at 00:22
From: "Thanh Le Hoang" <thanh_le_ho...@web.de>
To: rcpp-devel@lists.r-forge.r-project.org
Subject: [Rcpp-devel] Problem with memory usage when using Rcpp(Parallel) in a package




Hello,

 

I'm writing my first package for a machine learning algorithm called self-organizing map where I use compiled code (with Rcpp) and parallelization (RcppParallel).
My computer uses Windows 10 (64 bit, 8 GB RAM) and I currently have a problem with the memory usage (shown in the Windows task manager) which keeps going up the longer the algorithm runs. The usage doesn't increase immediately, but only after a couple of seconds, and I only noticed it when I tried larger data sets. The memory is only freed by terminating/restarting the R session.

 

What is somewhat strange is that the memory usage is not attributed to RStudio or the R session (i.e. the memory usage in the task manager does not go up for the respective processes). According to RAMMap (which gives more information about memory usage on Windows) the used memory belongs to the "nonpaged pool". The RStudio profiler and lineprof did not seem to detect the memory leak (if I read the output correctly). So far I have rewritten parts of the C++ code to use references and pre-allocated memory, but it did not help.

 

The main function in the package calls several smaller functions written in C++ and it seems that all of those functions play a role here, but I have found a function where this problem occurs consistently. It calculates the (squared) Euclidean norm for each row of a given matrix (in parallel) with a boolean vector (oldColumns) specifying which columns should be used/ignored during this calculation:

 

https://pastebin.com/qgyzx0M7

 

When I pasted this code into a new project, I noticed that the problem only happens when I build (with devtools::build()) and install a package containing this function, regardless of whether I build a source package or a binary package. When I just sourceCpp a file with this function, no memory problems occur. So could this have anything to do with how I build packages? Until now I have followed the "R packages" book written by Hadley Wickham for this.

 

Here is some R code which generates some test data and calls the function.

 

https://pastebin.com/c0RaeW9K

 

Every time I run this code (which takes a couple of minutes), the memory usage goes up by 4% to 6%, which makes my package unusable for larger sets of data.
I have been stuck on this problem for a week now and any help would be appreciated.

 

Thank you,
Thanh



[Rcpp-devel] Problem with memory usage when using Rcpp(Parallel) in a package

2017-10-08 Thread Thanh Le Hoang

Hello,

 

I'm writing my first package for a machine learning algorithm called self-organizing map where I use compiled code (with Rcpp) and parallelization (RcppParallel).
My computer uses Windows 10 (64 bit, 8 GB RAM) and I currently have a problem with the memory usage (shown in the Windows task manager) which keeps going up the longer the algorithm runs. The usage doesn't increase immediately, but only after a couple of seconds, and I only noticed it when I tried larger data sets. The memory is only freed by terminating/restarting the R session.

 

What is somewhat strange is that the memory usage is not attributed to RStudio or the R session (i.e. the memory usage in the task manager does not go up for the respective processes). According to RAMMap (which gives more information about memory usage on Windows) the used memory belongs to the "nonpaged pool". The RStudio profiler and lineprof did not seem to detect the memory leak (if I read the output correctly). So far I have rewritten parts of the C++ code to use references and pre-allocated memory, but it did not help.

 

The main function in the package calls several smaller functions written in C++ and it seems that all of those functions play a role here, but I have found a function where this problem occurs consistently. It calculates the (squared) Euclidean norm for each row of a given matrix (in parallel) with a boolean vector (oldColumns) specifying which columns should be used/ignored during this calculation:

 

https://pastebin.com/qgyzx0M7
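
The pastebin code is not reproduced in this archive. As a rough illustration of the computation described above, here is a minimal standalone C++ sketch. It is not the poster's code: it deliberately avoids Rcpp and RcppParallel so that it compiles on its own, the names maskedRowNorms, squaredRowNorms and useColumn are hypothetical, and it assumes that a true entry in the boolean vector means "use this column" (the post does not say which way round oldColumns works). The matrix is stored column-major, as R stores matrices, and the (begin, end) row range mirrors the range a parallel worker would be handed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Squared Euclidean norm of rows [begin, end) of a column-major matrix,
// summing only the columns whose mask entry is true (assumed meaning).
// Each row writes to its own slot in `out`, so disjoint row ranges could be
// processed by different threads without synchronisation.
void maskedRowNorms(const std::vector<double>& x,
                    std::size_t nrow, std::size_t ncol,
                    const std::vector<bool>& useColumn,
                    std::size_t begin, std::size_t end,
                    std::vector<double>& out) {
    for (std::size_t i = begin; i < end; ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < ncol; ++j) {
            if (useColumn[j]) {
                const double v = x[i + j * nrow];  // element (i, j), column-major
                sum += v * v;
            }
        }
        out[i] = sum;
    }
}

// Serial driver over all rows; a parallel framework would instead call
// maskedRowNorms on several disjoint row ranges.
std::vector<double> squaredRowNorms(const std::vector<double>& x,
                                    std::size_t nrow, std::size_t ncol,
                                    const std::vector<bool>& useColumn) {
    assert(x.size() == nrow * ncol);
    assert(useColumn.size() == ncol);
    std::vector<double> out(nrow, 0.0);  // allocated once, up front
    maskedRowNorms(x, nrow, ncol, useColumn, 0, nrow, out);
    return out;
}
```

For a 2 x 3 matrix with rows (1, 2, 3) and (4, 5, 6) and the mask (true, false, true), squaredRowNorms returns 10 and 52. Nothing in this loop allocates per row, which is consistent with the thread's conclusion that the leak came from the package build configuration and not from the algorithm.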

 

When I pasted this code into a new project, I noticed that the problem only happens when I build (with devtools::build()) and install a package containing this function, regardless of whether I build a source package or a binary package. When I just sourceCpp a file with this function, no memory problems occur. So could this have anything to do with how I build packages? Until now I have followed the "R packages" book written by Hadley Wickham for this.

 

Here is some R code which generates some test data and calls the function.

 

https://pastebin.com/c0RaeW9K

 

Every time I run this code (which takes a couple of minutes), the memory usage goes up by 4% to 6%, which makes my package unusable for larger sets of data.
I have been stuck on this problem for a week now and any help would be appreciated.

 

Thank you,
Thanh
