Re: [R] limits of a data frame size for reading into R
I sometimes have to work with vectors/matrices with > 2^31 - 1 elements, and I have found the bigmemory package to be of great help. My lab is also going to learn the sqldf package for getting bits of big data into and out of R. Learning both of those packages should help you work with large datasets in R.

That said, I still hold out hope that someday the powers that be - or some hotshot operation like R+ or Revolutions - will see that increasing numbers of users routinely need to access > 2^31 - 1 elements, and that the packages above are a band-aid on a deeper issue: using such large datasets with ease in R. As of now, it remains quite awkward.

Matt

On Tue, Aug 3, 2010 at 12:32 PM, Duncan Murdoch wrote:
> On 03/08/2010 2:28 PM, Dimitri Liakhovitski wrote:
>> And once one is above the limit that Jim indicated - is there anything one can do?
>
> Yes, there are several packages for handling datasets that are too big to
> fit in memory: biglm, ff, etc. You need to change your code to work with
> them, so it's a lot of work to do something unusual, but there are
> possibilities.
>
> Duncan Murdoch

--
Matthew C Keller
Asst. Professor of Psychology
University of Colorado at Boulder
www.matthewckeller.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
On 03/08/2010 2:28 PM, Dimitri Liakhovitski wrote:
> And once one is above the limit that Jim indicated - is there anything one can do?

Yes, there are several packages for handling datasets that are too big to fit in memory: biglm, ff, etc. You need to change your code to work with them, so it's a lot of work to do something unusual, but there are possibilities.

Duncan Murdoch
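Duncan's biglm suggestion works by fitting the regression in chunks, so the whole dataset never has to be in memory at once. A minimal sketch, assuming the biglm package is installed (the data here are simulated purely for illustration):

```r
library(biglm)

# Two chunks standing in for pieces of a dataset too big to load at once.
set.seed(1)
chunk1 <- data.frame(y = rnorm(1000), x1 = rnorm(1000), x2 = rnorm(1000))
chunk2 <- data.frame(y = rnorm(1000), x1 = rnorm(1000), x2 = rnorm(1000))

fit <- biglm(y ~ x1 + x2, data = chunk1)  # fit on the first chunk
fit <- update(fit, chunk2)                # fold in the next chunk

coef(fit)   # coefficients as if fit on all 2000 rows
```

In practice each chunk would come from read.table() with nrows/skip, or from a database query, rather than from rnorm().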
And once one is above the limit that Jim indicated - is there anything one can do?

Thank you!
Dimitri

On Tue, Aug 3, 2010 at 2:12 PM, Dimitri Liakhovitski wrote:
> Thanks a lot, it's very helpful!
> Dimitri

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com
Thanks a lot, it's very helpful!
Dimitri

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com
On 03/08/2010 1:10 PM, Dimitri Liakhovitski wrote:
> Is there a limit to the number of columns of a data frame that R can
> handle? I am asking because where I work many use SAS, and they are
> running into a limit of ~13,700 columns there.
>
> Since I am asking - is there a limit to the number of rows?

Besides what Jim said, there is a 2^31-1 limit on the number of elements in a vector. Data frames are vectors of vectors, so you can have at most 2^31-1 rows and 2^31-1 columns. Matrices are vectors, so they're limited to 2^31-1 elements in total. This is only likely to be a limitation on a 64-bit machine; in 32 bits you'll run out of memory first.

Duncan Murdoch
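The 2^31 - 1 figure Duncan cites is R's maximum vector length, since vector indices were 32-bit signed integers. A quick sketch of where that number comes from and how a data frame is itself a vector of column vectors:

```r
# The vector-length limit: the largest 32-bit signed integer.
.Machine$integer.max   # 2147483647
2^31 - 1               # the same number

# A data frame is a list (i.e. a vector) of column vectors, so the
# 2^31 - 1 cap applies to rows and columns separately, not to cells.
df <- as.data.frame(matrix(rnorm(50), nrow = 10, ncol = 5))
length(df)             # 5 column vectors
nrow(df)               # 10 rows in each
```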
You probably don't want an object that is larger than about 25% of physical memory, so that copies can be made during some processing. If you are running on a 32-bit system, which will limit you to at most 3 GB of memory, then your largest object should be no greater than about 800 MB. If you want 13,700 columns of numeric data (8 bytes per element), each row requires about 110 KB, so you could probably have an object with roughly 7,500-8,000 rows. On 64-bit systems the limit is mostly how much you want to spend on memory.

On Tue, Aug 3, 2010 at 1:10 PM, Dimitri Liakhovitski wrote:
> Is there a limit to the number of columns of a data frame that R can
> handle? I am asking because where I work many use SAS, and they are
> running into a limit of ~13,700 columns there.
>
> Since I am asking - is there a limit to the number of rows?

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
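Jim's back-of-the-envelope arithmetic can be reproduced directly; the 800 MB figure is his 25%-of-3 GB largest-object rule of thumb:

```r
# Reproduce the memory estimate: bytes per row, then how many rows
# fit in an ~800 MB largest-object budget.
cols          <- 13700        # columns of numeric (double) data
bytes_per_row <- cols * 8     # 8 bytes per double -> ~110 KB per row
budget        <- 800 * 2^20   # ~800 MB, i.e. 25% of a 3 GB 32-bit limit
max_rows      <- budget %/% bytes_per_row

bytes_per_row                 # 109600 bytes per row
max_rows                      # about 7,650 rows
```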
[R] limits of a data frame size for reading into R
I understand the question I am about to ask is rather vague and depends on the task and my PC's memory. However, I'll give it a try:

Let's assume the goal is just to read the data frame into R and then do some simple analyses with it (e.g., multiple regression of some variables onto some - just a few - variables).

Is there a limit to the number of columns of a data frame that R can handle? I am asking because where I work many use SAS, and they are running into a limit of ~13,700 columns there.

Since I am asking - is there a limit to the number of rows?

Or is the correct way of asking the question: my PC's memory is X; the tab-delimited .txt file I am trying to read in has a size of YYY MB - can I read it in?

Thanks a lot!

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com