[R] memory limit

2008-11-26 Thread iwalters

I'm currently working with very large datasets that consist of 1,000,000+
rows.  Is it at all possible to use R for datasets of this size, or should I
rather consider C++/Java?


-- 
View this message in context: 
http://www.nabble.com/increasing-memory-limit-in-Windows-Server-2008-64-bit-tp20675880p20699700.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] memory limit

2008-11-26 Thread seanpor

Good afternoon,

The short answer is "yes", the long answer is "it depends".

It all depends on what you want to do with the data.  I'm working with
dataframes of a couple of million lines on this plain desktop machine, and
for my purposes it works fine.  I read in text files, manipulate them,
convert them into dataframes, and do some basic descriptive stats and tests
on them, a couple of columns at a time - all quick and simple in R.  There
are also packages which are set up to handle very large datasets, e.g. biglm
[1].

If you're using algorithms which require vast quantities of memory then, as
the previous emails in this thread suggest, you might need to run R on a
64-bit platform.

If you're working with a problem which is "embarrassingly parallel" [2], then
there are a variety of solutions; if you're somewhere in between, then the
solutions are much more data-dependent.

The flip question: how long would it take you to get up and running with the
functionality you require (tried and tested in R) if you're going to be
re-working things in C++?

I suggest that you have a look at R, possibly starting with a subset of your
full dataset - you'll be amazed how quickly you can get up and running.

As suggested at the start of this email... "it depends"...

Best Regards,
Sean O'Riordain
Dublin

[1] http://cran.r-project.org/web/packages/biglm/index.html
[2] http://en.wikipedia.org/wiki/Embarrassingly_parallel
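
As a rough illustration of the biglm approach mentioned above: the model is
fitted on one chunk of the file at a time, so the full dataset never has to
sit in memory at once.  This is only a sketch - the file name, column names
and chunk size below are made up.

library(biglm)

con <- file("big_data.csv", open = "r")
first <- read.csv(con, nrows = 100000)              # first chunk, read with the header
fit <- biglm(y ~ x1 + x2, data = first)             # initial fit on that chunk
repeat {
  chunk <- tryCatch(read.csv(con, nrows = 100000, header = FALSE,
                             col.names = names(first)),
                    error = function(e) NULL)       # NULL once the file is exhausted
  if (is.null(chunk) || nrow(chunk) == 0) break
  fit <- update(fit, chunk)                         # fold the next chunk into the fit
}
close(con)
summary(fit)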


-- 
View this message in context: 
http://www.nabble.com/increasing-memory-limit-in-Windows-Server-2008-64-bit-tp20675880p20700590.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] memory limit

2008-11-26 Thread Stavros Macrakis
I routinely compute with a 2,500,000-row dataset with 16 columns,
which takes 410MB of storage; my Windows box has 4GB, which avoids
thrashing.  As long as I'm careful not to compute and save multiple
copies of the entire data frame (because 32-bit Windows R is limited
to about 1.5GB address space total, including any intermediate
results), R works impressively well and fast with this dataset for
selections, calculations, cross-tabs, plotting, etc.  For example,
simple single-column statistics and cross-tabs take << 1 sec., summary
of the whole thing takes 16 sec. A linear regression between two
numeric columns takes < 20 sec. Plotting of all 2.5M points takes a
while, but that is no surprise (and is usually pointless [sic]
anyway). I have not tried to do any compute-intensive statistical
calculations on the whole data set.
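
By way of illustration, a small sketch of those operations on a synthetic
data frame of comparable size (made-up columns; real timings will of course
vary with hardware and the actual 16 columns):

n <- 2.5e6
d <- data.frame(g = sample(letters, n, replace = TRUE),
                x = rnorm(n), y = rnorm(n))
system.time(mean(d$x))             # single-column statistic
system.time(table(d$g))            # simple cross-tab
system.time(summary(d))            # summary of the whole frame
system.time(lm(y ~ x, data = d))   # simple linear regression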

The main (but minor) annoyance with it is that it takes about 90 secs
to load into memory using R's native binary "save" format, so I tend
to keep the process lying around rather than re-starting and
re-loading for each analysis. Fortunately, garbage collection is very
effective in reclaiming unused storage as long as I'm careful to
remove unnecessary objects.

-s





Re: [R] memory limit

2008-11-26 Thread Henrik Bengtsson
On Wed, Nov 26, 2008 at 1:16 PM, Stavros Macrakis <[EMAIL PROTECTED]> wrote:
> The main (but minor) annoyance with it is that it takes about 90 secs
> to load into memory using R's native binary "save" format, so I tend
> to keep the process lying around rather than re-starting and
> re-loading for each analysis. Fortunately, garbage collection is very
> effective in reclaiming unused storage as long as I'm careful to
> remove unnecessary objects.

FYI, objects saved with save(..., compress=FALSE) are notably faster
to read back.

/Henrik
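
A minimal sketch of that comparison, assuming a large data frame called
big.df and using made-up file names:

big.df <- data.frame(x = runif(2.5e6), y = runif(2.5e6))

save(big.df, file = "big_default.RData")                   # compressed (the default)
save(big.df, file = "big_plain.RData", compress = FALSE)   # larger file, faster to load back

system.time(load("big_default.RData"))
system.time(load("big_plain.RData"))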




[R] memory limit problem

2009-09-10 Thread oleg portnoy
Hi,
I have Windows XP 32-bit, 4 GB of DDR2 RAM, and R 2.9.2.
I am running into memory limit problems.
> memory.limit(4090)
[1] 4090

> memory.limit()
[1] 4090
> a <- trans.matrix.f(7)  # builds a big 16384 x 16384 matrix
Error: cannot allocate vector of size 512.0 Mb
I don't have any other objects in R's memory.
What should I do?
trans.matrix.f <- function(x) {
  ## start from a 2 x 2 matrix and double its dimensions 2*x - 1 times,
  ## giving a 2^(2*x) by 2^(2*x) numeric matrix
  tr.mat <- matrix(c(0, 1, 1, 0), 2)
  for (i in 2:(2 * x))
    tr.mat <- rbind(cbind(tr.mat,     tr.mat + 1),
                    cbind(tr.mat + 1, tr.mat))
  return(tr.mat)
}

Thanks!
Oleg.




[R] memory limit in R

2009-08-18 Thread Hongwei Dong
Hi all, I'm fitting a discrete choice model in R and I keep getting this error:
Error: cannot allocate vector of size 198.6 Mb

Does this mean the memory limit in R has been reached?

> memory.size()
[1] 1326.89
> memory.size(TRUE)
[1] 1336
> memory.limit()
[1] 1535

My laptop has 4 GB of memory and runs Windows Vista (32-bit).  I increased the
memory limit to 2500 MB, but I am still getting the same error message.  Can
anyone give me some suggestions?  Thanks.

Harry




Re: [R] memory limit problem

2009-09-10 Thread Steve Lianoglou
Hi,

On Thu, Sep 10, 2009 at 8:24 PM, oleg portnoy wrote:
> Hi,
> I have Windows XP 32-bit, 4 GB of DDR2 RAM, and R 2.9.2.
> I am running into memory limit problems.
>> memory.limit(4090)
> [1] 4090
>
>> memory.limit()
> [1] 4090
>> a <- trans.matrix.f(7)  # builds a big 16384 x 16384 matrix
> Error: cannot allocate vector of size 512.0 Mb
> I don't have any other objects in R's memory.
> What should I do?

Get a 64-bit system and ditch Windows?
http://thread.gmane.org/gmane.comp.lang.r.general/64637

Or maybe the bigmemory package can help?
http://cran.r-project.org/web/packages/bigmemory/
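
For a sense of scale, the finished 16384 x 16384 numeric matrix alone needs
roughly 2 GB before counting the rbind/cbind intermediates, which is why
32-bit Windows gives up.  A file-backed bigmemory matrix is one way around
the address-space limit; the lines below are only a sketch, and the
backing-file names are made up.

16384 * 16384 * 8 / 2^20      # ~2048 MB for a single copy stored as doubles

library(bigmemory)
big <- filebacked.big.matrix(nrow = 16384, ncol = 16384, type = "integer",
                             init = 0L,
                             backingfile    = "trans.bin",
                             descriptorfile = "trans.desc")
big[1, 1:4] <- c(0L, 1L, 1L, 2L)  # fill in blocks rather than building by rbind/cbind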

-steve
-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] memory limit in R

2009-08-18 Thread Uwe Ligges



Hongwei Dong wrote:
> Hi all, I'm fitting a discrete choice model in R and I keep getting this error:
> Error: cannot allocate vector of size 198.6 Mb
>
> Does this mean the memory limit in R has been reached?
>
> > memory.size()
> [1] 1326.89
> > memory.size(TRUE)
> [1] 1336
> > memory.limit()
> [1] 1535
>
> My laptop has 4 GB of memory and runs Windows Vista (32-bit).  I increased the
> memory limit to 2500 MB, but I am still getting the same error message.  Can
> anyone give me some suggestions?  Thanks.

See ?Memory: more than 2047 MB is not easily possible on your 32-bit Windows
(and even that might not be sufficient for your problem).


Uwe Ligges







Re: [R] memory limit in R

2009-08-18 Thread Hongwei Dong
The size of my .Rdata workspace is about 9.2 MB, and the data I'm using is the
only object in this workspace.  Is that a large one?
Thanks.
Harry

On Tue, Aug 18, 2009 at 4:21 AM, jim holtman  wrote:

> About 2 GB is the limit of the address space on 32-bit Windows (you can
> get up to 3 GB with a special flag; check the documentation).  Check the
> size of the other objects in your workspace and remove any you don't need
> anymore.  A rule of thumb is that your largest object should be at most
> 25% of the available space; in your case I would limit it to at most
> 500 MB, but try to get by with a smaller object.  Since everything is
> kept in memory, you need to conserve.  You never did say what you were
> doing or how large the objects you were working on were.
>
> If they are large, think about keeping the data in a database and only
> retrieving the portion you need.  Try to run with a smaller sample.
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
>
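
A minimal sketch of the database route suggested above, assuming the RSQLite
package is installed; the table, column and object names are made up:

library(DBI)
library(RSQLite)

con <- dbConnect(RSQLite::SQLite(), "choices.sqlite")
dbWriteTable(con, "choices", big.choice.df)           # one-off load of the full data frame
sub <- dbGetQuery(con,
  "SELECT * FROM choices WHERE region = 'west'")      # pull back only the rows you need
dbDisconnect(con)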




Re: [R] memory limit in R

2009-08-18 Thread Jim Holtman
Do 'object.size' on all the objects in 'ls()'; also show the output of
'gc()'.
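
A short sketch of that check:

sort(sapply(ls(), function(x) object.size(get(x))), decreasing = TRUE)  # bytes per object
gc()                                                                    # memory bookkeeping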

Sent from my iPhone


