RE: [R] Windows Memory Issues

2003-12-10 Thread JFRI (Jesper Frickmann)
I recommend you get the latest version 1.8.1 beta. I also had some
memory problems and that fixed it.

Kind regards, 
Jesper Frickmann 
Statistician, Quality Control 
Novozymes North America Inc. 
Tel. +1 919 494 3266
Fax +1 919 494 3460



Re: [R] Windows Memory Issues

2003-12-10 Thread Duncan Murdoch
On Wed, 10 Dec 2003 10:43:12 -0500, JFRI (Jesper Frickmann)
[EMAIL PROTECTED] wrote :

I recommend you get the latest version 1.8.1 beta. I also had some
memory problems and that fixed it.

1.8.1 has been released, so there's no more beta.  The release is
available on CRAN (see http://cran.r-project.org/bin/windows/base for
the Windows binary build).  A patched version is also worth looking at if
the release still has the bug; I occasionally update a binary build at
http://www.stats.uwo.ca/faculty/murdoch/software/r-devel/rw1081pat.exe

Duncan Murdoch



Re: [R] Windows Memory Issues

2003-12-09 Thread Benjamin . STABLER
I would also like some clarification about R memory management.  Like Doug,
I didn't find anything about consecutive calls to gc() to free more memory.
We run into memory limit problems every now and then and a better
understanding of R's memory management would go a long way.  I am interested
in learning more and was wondering if there is any specific R documentation
that explains R's memory usage?  Or maybe some good links about memory and
garbage collection.  Thanks.

Benjamin Stabler
Transportation Planning Analysis Unit
Oregon Department of Transportation
555 13th Street NE, Suite 2
Salem, OR 97301  Ph: 503-986-4104

---

Message: 21
Date: Mon, 8 Dec 2003 09:51:12 -0800 (PST)
From: Douglas Grove [EMAIL PROTECTED]
Subject: Re: [R] Windows Memory Issues
To: Prof Brian Ripley [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Message-ID:
[EMAIL PROTECTED]
Content-Type: TEXT/PLAIN; charset=US-ASCII

On Sat, 6 Dec 2003, Prof Brian Ripley wrote:

 I think you misunderstand how R uses memory.  gc() does not free up all 
 the memory used for the objects it frees, and repeated calls will free 
 more.  Don't speculate about how memory management works: do your 
 homework!

Are you saying that consecutive calls to gc() will free more memory than
a single call, or am I misunderstanding?   Reading ?gc and ?Memory I don't
see anything about this mentioned.  Where should I be looking to find 
more comprehensive info on R's memory management??  I'm not writing any
packages, just would like to have a better handle on efficiently using
memory as it is usually the limiting factor with R.  FYI, I'm running
R1.8.1 and RedHat9 on a P4 with 2GB of RAM in case there is any platform
specific info that may be applicable.

Thanks,

Doug Grove
Statistical Research Associate
Fred Hutchinson Cancer Research Center


 [remainder of the quoted digest snipped; Prof Ripley's reply and Richard Pugh's original message appear in full later in this thread]

Re: [R] Windows Memory Issues

2003-12-09 Thread Prof Brian Ripley
On Tue, 9 Dec 2003 [EMAIL PROTECTED] wrote:

 I would also like some clarification about R memory management.  Like Doug,
 I didn't find anything about consecutive calls to gc() to free more memory.

It was a statement about Windows, and about freeing memory *to Windows*.
Douglas Grove apparently had misread both the subject line and the 
sentence.

 We run into memory limit problems every now and then and a better
 understanding of R's memory management would go a long way.  I am interested
 in learning more and was wondering if there is any specific R documentation
 that explains R's memory usage?  Or maybe some good links about memory and
 garbage collection.  Thanks.

There are lots of comments in the source files.  And as I already said 
(but has been excised below), this is not relevant to the next version of 
R anyway.

BTW, the message below has been selectively edited, so please consult the 
original.

 [quoted digest snipped; the original messages from Douglas Grove, Prof Ripley and Richard Pugh appear in full later in this thread]

RE: [R] Windows Memory Issues

2003-12-09 Thread Pikounis, Bill

 [snipped]  Or maybe some good links 
 about memory and
 garbage collection. 

As is mentioned from time to time on this list when the above subject comes
up, Windows memory is a complicated topic.  One open-source utility I have
found helpful for monitoring memory when I work under XP is called RAMpage,
authored by John Fitzgibbon, and is available at

http://www.jfitz.com/software/RAMpage/


In its FAQ / Help, it touches on a lot of general memory and resource
issues, which I found helpful to learn about.  

http://www.jfitz.com/software/RAMpage/RAMpage_FAQS.html

(Though the author clearly warns that its usefulness for freeing memory
may be no more than cosmetic on NT / 2000 / XP systems.)

Hope that helps.

Bill


Bill Pikounis, Ph.D.

Biometrics Research Department
Merck Research Laboratories
PO Box 2000, MailDrop RY33-300  
126 E. Lincoln Avenue
Rahway, New Jersey 07065-0900
USA

Phone: 732 594 3913
Fax: 732 594 1565


 [quoted original message and digest snipped; see Benjamin Stabler's message above and the other original messages later in this thread]

RE: [R] Windows Memory Issues

2003-12-09 Thread Benjamin . STABLER
Thanks for the reply.  So are you saying that multiple calls to gc() free
up memory to Windows and then other processes can use that newly freed
memory?  So multiple calls to gc() do not actually make more memory
available to new R objects that I might create.  The reason I ask is that
I want to know how to use all the available memory that I can to store
objects in R.  ?gc says that garbage collection is run without user
intervention so there is really nothing I can do to improve memory under
Windows except increase the --max-mem-size at startup (which is limited to
1.7GB under the current version of R for Windows and will rise to about 3GB
for R 1.9).  Thanks again.
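
For reference, the Windows-only memory helpers that go with that limit look
roughly like this (a minimal sketch; the numbers are purely illustrative,
not recommendations):

 memory.size()              # MB currently in use by this R session
 memory.size(max = TRUE)    # most memory obtained from Windows so far, in MB
 memory.limit()             # current limit, as set by --max-mem-size, in MB
 memory.limit(size = 1700)  # request a higher limit (it can only be raised)

The same limit can also be set when starting R, e.g.
Rgui.exe --max-mem-size=1700M.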

Ben Stabler

-Original Message-
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 09, 2003 9:29 AM
To: STABLER Benjamin
Cc: [EMAIL PROTECTED]
Subject: Re: [R] Windows Memory Issues


On Tue, 9 Dec 2003 [EMAIL PROTECTED] wrote:

 I would also like some clarification about R memory 
management.  Like Doug,
 I didn't find anything about consecutive calls to gc() to 
free more memory.

It was a statement about Windows, and about freeing memory *to 
Windows*.
Douglas Grove apparently had misread both the subject line and the 
sentence.

 We run into memory limit problems every now and then and a better
 understanding of R's memory management would go a long way.  
I am interested
 in learning more and was wondering if there is any specific 
R documentation
 that explains R's memory usage?  Or maybe some good links 
about memory and
 garbage collection.  Thanks.

There are lots of comments in the source files.  And as I already said 
(but has been excised below), this is not relevant to the next 
version of 
R anyway.

BTW, the message below has been selectively edited, so please 
consult the 
original.

 [quoted digest snipped; the original messages appear in full later in this thread]

RE: [R] Windows Memory Issues

2003-12-09 Thread Thomas Lumley
On Tue, 9 Dec 2003 [EMAIL PROTECTED] wrote:

 Thanks for the reply.  So are you saying that multiple calls to gc() free
 up memory to Windows and then other processes can use that newly freed
 memory?

No.  You typically can't free memory back to Windows (or many other OSes).


 So multiple calls to gc() do not actually make more memory
 available to new R objects that I might create.

Yes and no. It makes more memory available, but only memory that would
have been made available in any case if you had tried to use it.  R calls
the garbage collector before requesting more memory from the operating
system and before running out of memory.
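
A small base-R illustration of that point (the object size is arbitrary):
the space held by a removed object is reused for later allocations whether
or not gc() is called by hand.

 x <- rnorm(5e6)   # roughly 40 MB of doubles
 rm(x)             # no explicit gc() here
 y <- rnorm(5e6)   # R garbage-collects on its own if it needs the space
 gc()              # optional: reports usage; the space was already reusable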


 The reason I ask is that
 I want to know how to use all the available memory that I can to store
 objects in R.  ?gc says that garbage collection is run without user
 intervention so there is really nothing I can do to improve memory under
 Windows except increase the --max-mem-size at startup

You can't do anything else to make more memory available, only to use
less.

-thomas



RE: [R] Windows Memory Issues

2003-12-09 Thread Prof Brian Ripley
On Tue, 9 Dec 2003 [EMAIL PROTECTED] wrote:

 Thanks for the reply.  So are you saying that multiple calls to gc() free
 up memory to Windows and then other processes can use that newly freed
 memory?  So multiple calls to gc() do not actually make more memory

That is what I said.  Why do people expect me to repeat myself?

 available to new R objects that I might create.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



RE: [R] Windows Memory Issues

2003-12-09 Thread Prof Brian Ripley
On Tue, 9 Dec 2003, Thomas Lumley wrote:

 On Tue, 9 Dec 2003 [EMAIL PROTECTED] wrote:
 
  Thanks for the reply.  So are you saying that multiple calls to gc() free
  up memory to Windows and then other processes can use that newly freed
  memory?
 
 No.  You typically can't free memory back to Windows (or many other OSes).

At least using R under Windows NT/2000/XP you can.  I've watched it do so 
whilst fixing memory leaks.

There is another complication here: R for Windows uses a third-party
malloc, and you can free memory back to that if not to the OS.  The reason
Windows is special is the issue of fragmentation, from which OSes using mmap
(and R-devel under Windows) typically do not suffer.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] Windows Memory Issues

2003-12-08 Thread Douglas Grove
On Sat, 6 Dec 2003, Prof Brian Ripley wrote:

 I think you misunderstand how R uses memory.  gc() does not free up all 
 the memory used for the objects it frees, and repeated calls will free 
 more.  Don't speculate about how memory management works: do your 
 homework!

Are you saying that consecutive calls to gc() will free more memory than
a single call, or am I misunderstanding?   Reading ?gc and ?Memory I don't
see anything about this mentioned.  Where should I be looking to find 
more comprehensive info on R's memory management??  I'm not writing any
packages, just would like to have a better handle on efficiently using
memory as it is usually the limiting factor with R.  FYI, I'm running
R1.8.1 and RedHat9 on a P4 with 2GB of RAM in case there is any platform
specific info that may be applicable.

Thanks,

Doug Grove
Statistical Research Associate
Fred Hutchinson Cancer Research Center




 
 [remainder of Prof Ripley's message, including the full quote of Richard Pugh's original, snipped; both appear in full later in this thread]




[R] Windows Memory Issues

2003-12-06 Thread Richard Pugh
Hi all,
 
I am currently building an application based on R 1.7.1 (+ compiled
C/C++ code + MySql + VB).  I am building this application to work on 2
different platforms (Windows XP Professional (500MB memory) and Windows
NT 4.0 with Service Pack 6 (1GB memory)).  This is a very memory-intensive
application performing sophisticated operations on large matrices
(typically 5000x1500 matrices).
 
I have run into some issues regarding the way R handles its memory,
especially on NT.  In particular, R does not seem able to reclaim some
of the memory used following the creation and manipulation of large data
objects.  For example, I have a function which receives a (large)
numeric matrix, matches it against more data (maybe imported from MySql)
and returns a large list structure for further analysis.  A typical call
may look like this .
 
 myInputData <- matrix(sample(1:100, 750, T), nrow=5000)
 myPortfolio <- createPortfolio(myInputData)
 
It seems I can only repeat this code process 2/3 times before I have to
restart R (to get the memory back).  I use the same object names
(myInputData and myPortfolio) each time, so I am not creating more large
objects ..
 
I think the problems I have are illustrated with the following example
from a small R session .
 
 # Memory usage for Rgui process = 19,800k
 testData <- matrix(rnorm(1000), 1000) # Create big matrix
 # Memory usage for Rgui process = 254,550k
 rm(testData)
 # Memory usage for Rgui process = 254,550k
 gc()
 used (Mb) gc trigger  (Mb)
Ncells 369277  9.9 667722  17.9
Vcells  87650  0.7   24286664 185.3
 # Memory usage for Rgui process = 20,200k
 
In the above code, R cannot reclaim all the memory used, so the memory
usage increases from 19,800k to 20,200k.  However, the following example is
more typical of the environments I use .
 
 # Memory 128,100k
 myTestData <- matrix(rnorm(1000), 1000)
 # Memory 357,272k
 rm(myTestData)
 # Memory 357,272k
 gc()
  used (Mb) gc trigger  (Mb)
Ncells  478197 12.8 818163  21.9
Vcells 9309525 71.1   31670210 241.7
 # Memory 279,152k
 
Here, the memory usage increases from 128,100k to 279,152k.
 
Could anyone point out what I could do to rectify this (if anything), or
generally what strategy I could take to improve this?
 
Many thanks,
Rich.
 
Mango Solutions
Tel : (01628) 418134
Mob : (07967) 808091
 




Re: [R] Windows Memory Issues

2003-12-06 Thread Prof Brian Ripley
I think you misunderstand how R uses memory.  gc() does not free up all 
the memory used for the objects it frees, and repeated calls will free 
more.  Don't speculate about how memory management works: do your 
homework!

In any case, you are using an outdated version of R, and your first
course of action should be to compile up R-devel and try that, as there
have been improvements to memory management under Windows.  You could also
try compiling using the native malloc (and that *is* described in the
INSTALL file) as that has different compromises.


On Sat, 6 Dec 2003, Richard Pugh wrote:

 [Richard Pugh's original message quoted in full; snipped -- see his message above]

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] Windows Memory Issues

2003-12-06 Thread Jason Turner
Richard Pugh wrote:
...
I have run into some issues regarding the way R handles its memory,
especially on NT.  
...

Actually, you've run into NT's nasty memory management.  Welcome! :)
R-core have worked very hard to work around Windows memory issues, so 
they've probably got a better answer than I can give.  I'll give you a 
few quick answers, and then wait for correction when one of them replies.

A typical call
may look like this .
 

myInputData <- matrix(sample(1:100, 750, T), nrow=5000)
myPortfolio <- createPortfolio(myInputData)
 
It seems I can only repeat this code process 2/3 times before I have to
restart R (to get the memory back).  I use the same object names
(myInputData and myPortfolio) each time, so I am not creating more large
objects ..
Actually, you do.  Re-using a name does not re-use the same blocks of 
memory.  The size of the object may change, for example.

 
I think the problems I have are illustrated with the following example
from a small R session .
 

# Memory usage for Rgui process = 19,800k
testData <- matrix(rnorm(1000), 1000) # Create big matrix
# Memory usage for Rgui process = 254,550k
rm(testData)
# Memory usage for Rgui process = 254,550k
gc()
 used (Mb) gc trigger  (Mb)
Ncells 369277  9.9 667722  17.9
Vcells  87650  0.7   24286664 185.3
# Memory usage for Rgui process = 20,200k
 
In the above code, R cannot reclaim all the memory used, so the memory
usage increases from 19,800k to 20,200k.  However, the following example is
more typical of the environments I use .
 

# Memory 128,100k
myTestData <- matrix(rnorm(1000), 1000)
# Memory 357,272k
rm(myTestData)
# Memory 357,272k
gc()
  used (Mb) gc trigger  (Mb)
Ncells  478197 12.8 818163  21.9
Vcells 9309525 71.1   31670210 241.7
# Memory 279,152k
R can return memory to Windows, but it cannot *make* Windows take it 
back.  Exiting the app is the only guaranteed way to do this, for any 
application.

The fact that you get this with matrices makes me suspect 
fragmentation issues with memory, rather than pure lack of memory. 
Here, the memory is disorganised, thanks to some programmers in Redmond. 
When a matrix gets assigned, it needs all its memory to be contiguous. 
If the memory on your machine has, say, 250 MB free, but only in 1 MB 
chunks, and you need to build a 2 MB matrix, you're out of luck.

From the sounds of your calculations, they *must* be done as big 
matrices (true?).  If not, try a data structure that isn't a matrix or 
array; these require *contiguous* blocks of memory.  Lists, by 
comparison, can store their components in separate blocks.  Would a list 
of smaller matrices work?
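
For instance, a rough sketch (split_cols is a made-up helper, not an
existing function) of keeping the data as a list of column blocks, so that
no single allocation needs one huge contiguous chunk:

 ## split a matrix into a list of column blocks, each 'block' columns wide
 split_cols <- function(m, block = 100) {
   starts <- seq(1, ncol(m), by = block)
   lapply(starts, function(s)
     m[, s:min(s + block - 1, ncol(m)), drop = FALSE])
 }
 big <- matrix(rnorm(5000 * 1500), nrow = 5000)  # the size Rich describes
 blocks <- split_cols(big, block = 100)          # 15 matrices of 5000 x 100

(Building big in one piece still needs one contiguous block, of course; in
practice the pieces would be created or imported block by block.)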

Could anyone point out what I could do to rectify this (if anything), or
generally what strategy I could take to improve this?
Some suggestions:

1) Call gc() somewhere inside your routines regularly (a sketch follows 
this list).  Not guaranteed to help, but worth a try.

2) Get even more RAM, and hope it stabilises.

3) Change data structures to something other than one huge matrix. 
Matrices have huge computational advantages, but are pigs for memory.

4) Export the data crunching part of the application to an operating 
system that isn't notorious for bad memory management.  <opinion 
subjective=yes>I've almost never had anguish from Solaris.  Linux and 
FreeBSD are not bad.</opinion>  Have you considered running the results 
on a different machine, and storing the results in a fresh table on the 
same database as where you get the raw data?
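
As a sketch of suggestion 1 (the function here is only a stand-in for
something like createPortfolio): collect after each large intermediate
result so the freed space is reusable before the next chunk of work.

 process_chunk <- function(n) {
   m <- matrix(rnorm(n), nrow = 1000)  # large temporary matrix
   s <- colSums(m)                     # keep only a small summary
   rm(m)
   gc()                                # collect before the next chunk
   s
 }
 results <- lapply(rep(1e6, 5), process_chunk)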

Hope that helps.

Jason
--
Indigo Industrial Controls Ltd.
http://www.indigoindustrial.co.nz
64-21-343-545
[EMAIL PROTECTED]