Good afternoon,

The short answer is "yes", the long answer is "it depends".

It all depends on what you want to do with the data.  I'm working with
dataframes of a couple of million rows on a plain desktop machine, and
for my purposes it works fine.  I read in text files, manipulate them,
convert them into dataframes, and run some basic descriptive stats and
tests on them, a couple of columns at a time, all quick and simple in R.
There are also packages which are set up to handle very large datasets,
e.g. biglm [1].
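
As a rough illustration of that workflow, here is a minimal sketch in base R.
The file, the column names (group, x, y) and the row count are all made up
for the example; substitute your own.

```r
# Synthetic stand-in for a large text file (names and sizes are invented).
set.seed(1)
n <- 200000
path <- tempfile(fileext = ".txt")
write.table(
  data.frame(group = sample(c("a", "b"), n, replace = TRUE),
             x = rnorm(n),
             y = rnorm(n, mean = 0.1)),
  file = path, sep = "\t", row.names = FALSE, quote = FALSE
)

# Declaring colClasses up front spares read.table from guessing the types,
# which tends to save time and memory on big files.
dat <- read.table(path, header = TRUE, sep = "\t",
                  colClasses = c("character", "numeric", "numeric"))
dat$group <- factor(dat$group)

# Descriptive statistics and a test on a couple of columns at a time.
col_summary <- summary(dat[, c("x", "y")])
group_means <- tapply(dat$x, dat$group, mean)
tt <- t.test(x ~ group, data = dat)

print(col_summary)
print(group_means)
print(tt$p.value)

unlink(path)  # remove the temporary file
```

Working on a few columns at a time, as above, keeps the intermediate copies
small, which is usually what matters on a desktop machine.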

If you're using algorithms which require vast quantities of memory, then, as
the previous emails in this thread suggest, you might need a 64-bit build
of R.
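
A quick back-of-the-envelope check can tell you whether memory is likely to
be the problem.  The sketch below assumes purely numeric data (8 bytes per
value); the row and column counts are invented for illustration, and real
algorithms often need several working copies on top of this.

```r
# Hypothetical dataset dimensions; replace with your own.
rows <- 2e6
cols <- 20

# A double takes 8 bytes, so the raw data needs roughly this much memory.
est_bytes <- rows * cols * 8
est_mb <- est_bytes / 1024^2
print(est_mb)

# Compare with what R reports for a smaller object of the same shape.
m <- matrix(0, nrow = 1e5, ncol = cols)
print(object.size(m), units = "MB")

# TRUE on a 64-bit build of R, FALSE on a 32-bit build.
is_64bit <- .Machine$sizeof.pointer == 8
print(is_64bit)
```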

If you're working with a problem which is "embarrassingly parallel" [2], then
there are a variety of solutions; if you're somewhere in between, the
solutions are much more data dependent.
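
For the embarrassingly parallel case, one possible approach is to split the
data into independent chunks and hand them to worker processes.  The sketch
below assumes the "parallel" package that ships with current versions of R
(it is not mentioned elsewhere in this thread); the data and the per-chunk
function are made up.

```r
library(parallel)

# Invented example data: 8 independent groups of 1000 values each.
set.seed(42)
dat <- data.frame(id = rep(1:8, each = 1000), value = rnorm(8000))
chunks <- split(dat$value, dat$id)

# Start two local worker processes; this cluster type also works on Windows.
cl <- makeCluster(2)
chunk_means <- parLapply(cl, chunks, mean)  # each chunk is handled on its own
stopCluster(cl)

# Same computation done serially, for comparison.
parallel_result <- unname(unlist(chunk_means))
serial_result <- unname(sapply(chunks, mean))
print(all.equal(parallel_result, serial_result))
```

This only pays off when the chunks really are independent and the work per
chunk outweighs the cost of shipping the data to the workers.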

The flip question: how long would it take you to get up and running with the
functionality you require (tried and tested in R) if you were to rework
things in C++?

I suggest that you have a look at R, possibly using a subset of your full
dataset to start with - you'll be amazed how quickly you can get up and
running.
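
Two simple ways of getting such a subset are sketched below, assuming the
data sits in a delimited text file.  The file and its columns (id, value)
are invented for the example.

```r
# Synthetic stand-in for the full data file.
set.seed(7)
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:50000, value = rnorm(50000)),
          path, row.names = FALSE)

# Option 1: read only the first 1000 rows; cheap, but not a random sample.
head_rows <- read.csv(path, nrows = 1000)

# Option 2: draw a random sample of rows.  This reads the whole file first,
# so it only helps once the full data fits in memory.
full <- read.csv(path)
sample_rows <- full[sample(nrow(full), 1000), ]

print(dim(head_rows))
print(dim(sample_rows))

unlink(path)  # remove the temporary file
```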

As suggested at the start of this email... "it depends"...

Best Regards,
Sean O'Riordain
Dublin

[1] http://cran.r-project.org/web/packages/biglm/index.html
[2] http://en.wikipedia.org/wiki/Embarrassingly_parallel


iwalters wrote:
> 
> I'm currently working with very large datasets that consist out of
> 1,000,000 + rows.  Is it at all possible to use R for datasets this size
> or should I rather consider C++/Java.
> 
> 
> 
