Re: [R] Saving multiple rda-files as one rda-file

2013-07-26 Thread David Winsemius

On Jul 25, 2013, at 7:17 AM, Dark wrote:

 Hi, 
 
 Yes maybe I should have been more clear on my problem.
 I want to append the different data-frames back into one variable ( rbind )
 and save it as one R Data file.
 

Indeed. That was the operation I had in mind when I made my suggestions. 
Perhaps you need to create a set of toy dataframes with similar structure and 
then the audience can propose solutions. That's the usual process around these 
parts.
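
A toy pair along the following lines might be enough to reproduce the warning. This is only a sketch: the column names and values are invented for illustration, and only the object names follow the original post.

```r
# Hypothetical toy data: two chunks whose factor columns have different levels.
object1 <- data.frame(id = 1:3, grp = factor(c("a", "b", "a")))
object2 <- data.frame(id = 4:6, grp = factor(c("b", "c", "c")))

# Append object2 to a copy of object1 by indexed assignment, as in the original post.
combined <- object1
nextrow <- nrow(combined) + 1
combined[nextrow:(nextrow + nrow(object2) - 1), ] <- object2
# This raises the warning "invalid factor level, NA generated",
# because "c" is not a level of object1$grp.

levels(combined$grp)  # only the levels that object1 started with
anyNA(combined$grp)   # the rows holding "c" were stored as NA
```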

-- 
David.


 Regards Derk
 
 
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Saving-multiple-rda-files-as-one-rda-file-tp4672041p4672313.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA



Re: [R] Saving multiple rda-files as one rda-file

2013-07-26 Thread jim holtman
What you need to ensure is that you have sufficient physical memory for the
operations that you want to do.  I would suggest at least 3X the size of
the object you want to create.  If you have RData files that will result
(after the rbind) in a 40GB object, then you will need over 100GB of
physical memory: the rbind operation creates a new 40GB object from the
separate pieces, which also total 40GB, so that is 80GB right there.
If you then want to do some operations on an object that large, you might
be making copies, so you need memory for those as well.
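
As a rough illustration of that rule of thumb, the sizes of the pieces can be added up with object.size() before attempting the rbind. The list 'chunks' below is a hypothetical stand-in built from small random data frames, and the factor of 3 is just the suggestion above, not a measured requirement.

```r
# 'chunks' is a hypothetical list of small stand-in data frames.
chunks <- list(
  data.frame(x = rnorm(1000), g = sample(letters, 1000, replace = TRUE)),
  data.frame(x = rnorm(1000), g = sample(letters, 1000, replace = TRUE))
)

# Total size of the separate pieces, in bytes.
chunk_bytes <- sum(vapply(chunks,
                          function(d) as.numeric(object.size(d)),
                          numeric(1)))

# Pieces + the combined copy made by rbind + headroom for one working copy.
needed_bytes <- 3 * chunk_bytes

cat(sprintf("pieces: %.1f KB, suggested budget: %.1f KB\n",
            chunk_bytes / 1024, needed_bytes / 1024))
```

With 40GB of pieces the same arithmetic gives 120GB, which is where "over 100GB" comes from.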

Maybe you should consider keeping the data in a relational database and
then using SELECT queries to get just the data you need.  The SQL
aggregation functions might also be useful to reduce the amount of physical
memory that you need.
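
A minimal sketch of the database route, assuming the DBI and RSQLite packages are installed. The table name, column names and values are hypothetical; for real data the connection would point at a file on disk rather than ":memory:".

```r
library(DBI)  # assumes the DBI and RSQLite packages are installed

# In-memory database for the demo; use a file path for real data.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical chunks standing in for the pieces of the large CSV.
chunk1 <- data.frame(id = 1:3, grp = c("a", "b", "a"),
                     value = c(1.5, 2.0, 3.5), stringsAsFactors = FALSE)
chunk2 <- data.frame(id = 4:6, grp = c("b", "c", "c"),
                     value = c(4, 5, 6), stringsAsFactors = FALSE)

# Append each chunk to the same table, one at a time.
dbWriteTable(con, "measurements", chunk1)
dbWriteTable(con, "measurements", chunk2, append = TRUE)

# Pull only the rows and columns that are needed.
b_rows <- dbGetQuery(con,
  "SELECT id, value FROM measurements WHERE grp = 'b'")

# Let the database do the aggregation instead of R.
totals <- dbGetQuery(con,
  "SELECT grp, SUM(value) AS total FROM measurements GROUP BY grp")

dbDisconnect(con)
```

Only one chunk has to be in memory at a time while the table is being filled.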


On Fri, Jul 26, 2013 at 11:04 AM, David Winsemius dwinsem...@comcast.net wrote:


 On Jul 25, 2013, at 7:17 AM, Dark wrote:

  Hi,
 
  Yes maybe I should have been more clear on my problem.
  I want to append the different data-frames back into one variable (rbind)
  and save it as one R Data file.
 

 Indeed. That was the operation I had in mind when I made my suggestions.
 Perhaps you need to create a set of toy dataframes with similar structure
 and then the audience can propose solutions. That's the usual process
 around these parts.

 --
 David.


  Regards Derk
 
 
 




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] Saving multiple rda-files as one rda-file

2013-07-25 Thread Dark
Really no one has any suggestions on this issue?



--
View this message in context: 
http://r.789695.n4.nabble.com/Saving-multiple-rda-files-as-one-rda-file-tp4672041p4672278.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Saving multiple rda-files as one rda-file

2013-07-25 Thread PIKAL Petr
Hi

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org]
 On Behalf Of Dark
 Sent: Thursday, July 25, 2013 11:00 AM
 To: r-help@r-project.org
 Subject: Re: [R] Saving multiple rda-files as one rda-file
 
 Really no one has any suggestions on this issue?

What issue? AFAIK you can load any number of RDA files into your workspace and 
save your workspace as one file. I do not see any problem.
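
Something along these lines would do that. It is a sketch run in a temporary directory, and the file and object names are hypothetical. Note that it keeps the objects separate inside the single file; it does not rbind them.

```r
# Self-contained demo: write two small .rda files to a temporary directory.
rdadir <- tempfile("rda")
dir.create(rdadir)
object1 <- data.frame(id = 1:3)
object2 <- data.frame(id = 4:6)
save(object1, file = file.path(rdadir, "file1.rda"))
save(object2, file = file.path(rdadir, "file2.rda"))
rm(object1, object2)

# Load every .rda file into one environment instead of the global workspace.
e <- new.env()
for (f in list.files(rdadir, pattern = "\\.rda$", full.names = TRUE)) {
  load(f, envir = e)
}

# Save everything in that environment as a single file.
combined_file <- file.path(rdadir, "all.rda")
save(list = ls(e), envir = e, file = combined_file)
```

Calling save.image() would do the same for the whole global workspace.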

Regards
Petr

 
 
 


Re: [R] Saving multiple rda-files as one rda-file

2013-07-25 Thread David Winsemius

On Jul 22, 2013, at 4:18 AM, Dark wrote:

 Hi all,
 
 For a project we have to process some very large CSV files (up to 40 GB).
 To reduce their size and improve performance I wanted to store
 them as RData files.
 Since a single file was too big, I decided to split the CSV and save the
 parts as separate .rda files.
 So far so good. Now I want to bind them all together and save them as one
 .rda file again, and this is surprisingly difficult.
 
 First I load my rda files into my environment:
 load(paste(rdaoutputdir, "file1.rda", sep=""))
 load(paste(rdaoutputdir, "file2.rda", sep=""))
 load(paste(rdaoutputdir, "file3.rda", sep=""))
 etc.
 
 Then I try to combine them into one object.
 
 Using rbind like this gives memory allocation problems ('Error: cannot
 allocate vector of size'):
 objectToSave <- rbind(object1, object2, object3)
 
 Using pre-allocation gives me a factor level error. I used this code:
 nextrow <- nrow(object1)+1
 object1[nextrow:(nextrow+nrow(object2)-1),] <- object2
 # we need to ensure unique row names
 row.names(object1) = 1:nrow(object1)
 rm(object2)
 gc()
 
 15! warning messages:
 1: In `[<-.factor`(`*tmp*`, iseq, value = structure(c(1L,  ... :
   invalid factor level, NA generated
 2: In `[<-.factor`(`*tmp*`, iseq, value = structure(c(1L,  ... :
   invalid factor level, NA generated
 

The warning messages suggest that the factor levels in corresponding columns 
of object1, object2, object3 are not the same.

 What can I do?

You can identify which columns are factors and give the corresponding columns 
in each object a common set of levels that spans all the values.
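
A minimal sketch of that approach, on hypothetical toy data (the column names and values are invented, and it assumes both objects have the same factor columns):

```r
# Hypothetical toy data with mismatched factor levels.
object1 <- data.frame(id = 1:3, grp = factor(c("a", "b", "a")))
object2 <- data.frame(id = 4:6, grp = factor(c("b", "c", "c")))

# Identify the factor columns (assumed to be the same in both objects).
factor_cols <- names(object1)[vapply(object1, is.factor, logical(1))]

# Re-create each factor with the union of the levels from both objects.
for (col in factor_cols) {
  all_levels <- union(levels(object1[[col]]), levels(object2[[col]]))
  object1[[col]] <- factor(object1[[col]], levels = all_levels)
  object2[[col]] <- factor(object2[[col]], levels = all_levels)
}

# The indexed assignment from the original post no longer generates NAs.
nextrow <- nrow(object1) + 1
object1[nextrow:(nextrow + nrow(object2) - 1), ] <- object2
```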

OR:

Depending on the contents of that factor you could convert to character before 
the rbind operation. If the levels are not particularly long (in character 
length), that procedure might not expand the memory footprint very much.
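
The character route could look like this. Again a sketch on hypothetical toy data; to_character() is a helper defined here for illustration, not an existing function.

```r
# Hypothetical toy data with mismatched factor levels.
object1 <- data.frame(id = 1:3, grp = factor(c("a", "b", "a")))
object2 <- data.frame(id = 4:6, grp = factor(c("b", "c", "c")))

# Illustrative helper: convert every factor column of a data frame to character.
to_character <- function(d) {
  d[] <- lapply(d, function(col) if (is.factor(col)) as.character(col) else col)
  d
}

# With character columns there are no levels to conflict.
objectToSave <- rbind(to_character(object1), to_character(object2))

# Optionally turn the column back into a factor once everything is combined.
grp_factor <- factor(objectToSave$grp)
```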

-- 
David
 
 Regards Derk
 
 


David Winsemius
Alameda, CA, USA



Re: [R] Saving multiple rda-files as one rda-file

2013-07-25 Thread Dark
Hi, 

Yes maybe I should have been more clear on my problem.
I want to append the different data-frames back into one variable ( rbind )
and save it as one R Data file.

Regards Derk





[R] Saving multiple rda-files as one rda-file

2013-07-22 Thread Dark
Hi all,

For a project we have to process some very large CSV files (up to 40 GB).
To reduce their size and improve performance I wanted to store
them as RData files.
Since a single file was too big, I decided to split the CSV and save the
parts as separate .rda files.
So far so good. Now I want to bind them all together and save them as one
.rda file again, and this is surprisingly difficult.

First I load my rda files into my environment:
load(paste(rdaoutputdir, "file1.rda", sep=""))
load(paste(rdaoutputdir, "file2.rda", sep=""))
load(paste(rdaoutputdir, "file3.rda", sep=""))
etc.

Then I try to combine them into one object.

Using rbind like this gives memory allocation problems ('Error: cannot
allocate vector of size'):
objectToSave <- rbind(object1, object2, object3)

Using pre-allocation gives me a factor level error. I used this code:
nextrow <- nrow(object1)+1
object1[nextrow:(nextrow+nrow(object2)-1),] <- object2
# we need to ensure unique row names
row.names(object1) = 1:nrow(object1)
rm(object2)
gc()

15! warning messages:
1: In `[<-.factor`(`*tmp*`, iseq, value = structure(c(1L,  ... :
  invalid factor level, NA generated
2: In `[<-.factor`(`*tmp*`, iseq, value = structure(c(1L,  ... :
  invalid factor level, NA generated

What can I do?

Regards Derk



--
View this message in context: 
http://r.789695.n4.nabble.com/Saving-multiple-rda-files-as-one-rda-file-tp4672041.html
Sent from the R help mailing list archive at Nabble.com.
