[R] Reshape large Data Frame to new format
Hi R-experts,

I have a data.frame that I want to reshape to a certain format so I can use it in a tool for further analysis. Basically I have a very long list with IDs of persons and their codes. I want to create a row for every person with 25 of their codes. If a person has more than 25 codes, I want to add another row for that person. If a row contains fewer than 25 codes, I want to fill it with empty string values.

I have manually created a sample rawData and resultData and used dput so you can see my starting DF and the wanted result DF. The sample is of very limited size; the real data would contain a few million(!) records.

    rawData <- structure(list(PersonID = structure(c(
        1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
        1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
        2L, 2L, 2L, 2L, 2L,
        3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
        4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
        5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
        5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
        5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L),
        .Label = c("Person1", "Person2", "Person3", "Person4", "Person5"), class = "factor"),
      codes = c(
        34396L, 81878L, 67829L, 13428L, 12992L, 63724L, 85930L, 78497L, 59578L, 50733L,
        26154L, 47205L, 74578L, 12204L, 42435L, 96643L, 35242L, 29836L, 73031L, 11326L,
        96686L, 55849L, 56415L, 11064L, 78509L, 55715L, 75851L, 60682L, 16277L, 52763L,
        23429L, 39723L, 95809L, 60081L, 19618L, 46012L, 79188L, 54664L, 64420L, 72875L,
        97428L, 74897L, 75615L, 12023L, 21572L, 56177L, 61704L, 70879L, 69033L, 87224L,
        68670L, 65602L, 25476L, 81209L, 62086L, 35492L, 39771L, 14380L, 43858L, 53679L,
        78023L, 43785L, 69884L, 12840L, 54021L, 68002L, 79249L, 61784L, 7L, 28935L,
        91406L, 42045L, 97716L, 65690L, 57310L, 57627L, 32227L, 43121L, 22251L, 31255L,
        90660L, 89118L, 14558L, 99824L, 25005L, 62186L, 10527L, 99438L, 85656L, 79465L,
        35357L, 41697L, 83084L, 83590L, 16234L, 32480L, 50991L, 79524L, 93888L, 32637L,
        13253L, 76576L, 48632L, 68014L, 24281L, 74320L, 44601L, 36251L, 27825L, 85569L,
        21634L, 50364L, 74436L, 73216L, 89342L, 63562L, 88485L, 40552L, 49359L, 29636L,
        26285L, 13263L, 18106L, 78589L, 43479L, 12491L, 50840L, 77453L, 80578L, 43693L,
        89857L, 12837L, 55950L, 63049L, 84508L, 29736L, 88194L, 86849L, 54274L, 38713L)),
      .Names = c("PersonID", "codes"), row.names = c(NA, -140L), class = "data.frame")

    resultData = structure(list(
      PersonId = c("Person1", "Person1", "Person2", "Person3", "Person4", "Person5", "Person5", "Person5"),
      Code1 = c("34396", "55715", "97428", "56177", "68002", "90660", "74320", "89857"),
      Code2 = c("81878", "75851", "74897", "61704", "79249", "89118", "44601", "12837"),
      Code3 = c("67829", "60682", "75615", "70879", "61784", "14558", "36251", "55950"),
      Code4 = c("13428", "16277", "12023", "69033", "7", "99824", "27825", "63049"),
      Code5 = c("12992", "52763", "21572", "87224", "28935", "25005", "85569", "84508"),
      Code6 = c("63724", "23429", "", "68670", "91406", "62186", "21634", "29736"),
      Code7 = c("85930", "39723", "", "65602", "42045", "10527", "50364", "88194"),
      Code8 = c("78497", "95809", "", "25476", "97716", "99438", "74436", "86849"),
      Code9 = c("59578", "60081", "", "81209", "65690", "85656", "73216", "54274"),
      Code10 = c("50733", "19618", "", "62086", "57310", "79465", "89342", "38713"),
      Code11 = c("26154", "46012", "", "35492", "57627", "35357", "63562", ""),
      Code12 = c("47205", "79188", "", "39771", "32227", "41697", "88485", ""),
      Code13 = c("74578", "54664", "", "14380", "43121", "83084", "40552", ""),
      Code14 = c("12204", "64420", "", "43858", "22251", "83590", "49359", ""),
      Code15 = c("42435", "72875", "", "53679", "31255", "16234", "29636", ""),
      Code16 = c("96643", "", "", "", "78023", "32480", "26285", ""),
      Code17 = c("35242", "", "", "", "43785", "50991", "13263", ""),
      Code18 = c("29836", "", "", "", "69884", "79524", "18106", ""),
      Code19 = c("73031", "", "", "", "12840", "93888", "78589", ""),
      Code20 = c("11326", "", "", "", "54021", "32637", "43479", ""),
      Code21 = c("96686", "", "", "", "", "13253", "12491", ""),
      Code22 = c("55849", "", "", "", "", "76576", "50840", ""),
      Code23 = c("56415", "", "", "", "", "48632", "77453", ""),
      Code24 = c("11064", "", "", "", "", "68014", "80578", ""),
      Code25 = c("78509", "", "", "", "", "24281", "43693", "")),
      .Names = c("PersonId", "Code1", "Code2", "Code3", "Code4", "Code5", "Code6", "Code7", "Code8",
        "Code9", "Code10", "Code11", "Code12", "Code13", "Code14", "Code15", "Code16", "Code17",
        "Code18", "Code19", "Code20", "Code21", "Code22", "Code23", "Code24", "Code25"),
      row.names = c(NA, -8L), class = "data.frame")

This sample data explains very well what I'm trying to achieve. As you can see there are 2 rows for Person1 and 3 rows for Person5 because they have 40 and 60 codes respectively.

I'm a big fan of the data.table package, so maybe someone has a solution using that package? But of course any solution is welcome :-)

Thanks for any help in advance,
Regards
Dark

-- View this message in context: http://r.789695.n4.nabble.com/Reshape-large-Data-Frame-to-new-format-tp4687431.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman
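The reshaping described in the post above could be sketched roughly as follows. This is not taken from the thread: `widen_codes` is a hypothetical helper name, it sticks to base R so that it is self-contained (the same idea should carry over to data.table), and it assumes the input is non-empty and has the columns `PersonID` and `codes` from the sample.

```r
# Hypothetical helper (not from the thread): pack each person's codes into
# rows of at most `width` codes, padding short rows with "".
widen_codes <- function(df, width = 25L) {
  id   <- as.character(df$PersonID)
  code <- as.character(df$codes)

  # Running position of each code within its person, in input order.
  pos   <- ave(seq_along(id), id, FUN = seq_along)
  chunk <- (pos - 1L) %/% width        # 0-based output row within the person
  slot  <- (pos - 1L) %% width + 1L    # target column: Code1 .. Code<width>

  # One output row per (person, chunk) pair, in order of first appearance.
  key     <- paste(chunk, id)
  keys    <- unique(key)
  out_row <- match(key, keys)

  # Start from an all-"" matrix and fill in the codes with one vectorised
  # matrix-index assignment (no explicit loop over persons).
  out <- matrix("", nrow = length(keys), ncol = width,
                dimnames = list(NULL, paste0("Code", seq_len(width))))
  out[cbind(out_row, slot)] <- code

  data.frame(PersonId = id[!duplicated(key)], out, stringsAsFactors = FALSE)
}
```

With the sample data this would be called as `resultData <- widen_codes(rawData)`. Whether it is fast enough for a few million records would need to be measured; the sketch only avoids per-person loops, it makes no other performance promise.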
Re: [R] Create rows for columns in dataframe
Hi A.K,

Thanks for your great help. I'm now running your first suggestion on a 600,000-row sample after verifying that it works on a smaller sample. It has now been running for 40 minutes. Which method do you think will be faster?

Regards
Derk

-- View this message in context: http://r.789695.n4.nabble.com/Create-rows-for-columns-in-dataframe-tp4673607p4673704.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Create rows for columns in dataframe
Hi Arun,

The second method is indeed working much faster. It worked fast for my 600,000-row file. I still have 2 bigger files where processing becomes an issue, even though I have lots of memory (32 GB), for the second statement:

    res2 <- reshape(dat2, idvar = "newCol", varying = list(2:26), direction = "long")

Would data.table also take less memory? Speeding things up as well would be good. How would I do it? I think splitting the dataframe before merging it, and combining the parts afterwards, might also be an option; any ideas on that?

Regards
Dirk

-- View this message in context: http://r.789695.n4.nabble.com/Create-rows-for-columns-in-dataframe-tp4673607p4673750.html
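The "split first, combine afterwards" idea from the post above could look roughly like this. It is only a sketch: `process_in_chunks`, `chunk_size` and `f` are hypothetical names, and `f` stands for whatever per-chunk step is used (for example a wrapper around the `reshape()` call quoted above).

```r
# Hypothetical helper: apply `f` to consecutive blocks of rows and stack
# the per-block results with rbind.
process_in_chunks <- function(df, chunk_size, f) {
  if (nrow(df) == 0L) return(f(df))
  starts <- seq(1L, nrow(df), by = chunk_size)
  pieces <- lapply(starts, function(s) {
    rows <- s:min(s + chunk_size - 1L, nrow(df))
    f(df[rows, , drop = FALSE])
  })
  do.call(rbind, pieces)
}
```

This lowers the peak memory needed by each call to `f`, but all the pieces still have to fit in memory together for the final `rbind`, so it would help only if the per-chunk step is the bottleneck.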
[R] Create rows for columns in dataframe
Hi experts,

I have a dataframe with 100k+ records. It has a key/id column and 25 code columns. I would like to restructure it so that it has a row for each code column. I have a structure like this (used dput):

    structure(list(
      DSYSRTKY = structure(c(1L, 2L, 3L, 3L, 4L, 4L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("10005", "10203", "10315", "10327"), class = "factor"),
      C1 = structure(c(6L, 3L, 2L, 5L, 1L, 4L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("41401", "42831", "45341", "486", "5990", "71535"), class = "factor"),
      C2 = structure(c(5L, 1L, 3L, 6L, 4L, 2L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("4019", "51881", "5990", "6826", "78900", "V4986"), class = "factor"),
      C3 = structure(c(6L, 3L, 5L, 2L, 4L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("5119", "5939", "72400", "7850", "8052", "V1251"), class = "factor"),
      C4 = structure(c(6L, 5L, 3L, 1L, 2L, 4L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("3109", "4019", "4241", "42789", "V1011", "V454"), class = "factor"),
      C5 = structure(c(1L, 1L, 3L, 1L, 2L, 4L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "2720", "4019", "7823"), class = "factor"),
      C6 = structure(c(1L, 1L, 2L, 1L, 4L, 3L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "311", "41400", "49390"), class = "factor"),
      C7 = structure(c(1L, 1L, 2L, 1L, 3L, 4L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "2724", "2859", "V4581"), class = "factor"),
      C8 = structure(c(1L, 1L, 3L, 1L, 4L, 2L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "40390", "71680", "79029"), class = "factor"),
      C9 = structure(c(1L, 1L, 2L, 1L, 4L, 3L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "4168", "5859", "V1582"), class = "factor"),
      C10 = structure(c(1L, 1L, 3L, 1L, 1L, 2L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "49390", "7804"), class = "factor"),
      C11 = structure(c(1L, 1L, 3L, 1L, 1L, 2L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "2724", "V066"), class = "factor"),
      C12 = structure(c(1L, 1L, 2L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "6930"), class = "factor"),
      C13 = structure(c(1L, 1L, 2L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "41400"), class = "factor"),
      C14 = structure(c(1L, 1L, 2L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "V4581"), class = "factor"),
      C15 = structure(c(1L, 1L, 2L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "40291"), class = "factor"),
      C16 = structure(c(1L, 1L, 2L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "4280"), class = "factor"),
      C17 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
      C18 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
      C19 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
      C20 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
      C21 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
      C22 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
      C23 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
      C24 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
      C25 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor")),
      .Names = c("DSYSRTKY", "C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8", "C9", "C10", "C11", "C12",
        "C13", "C14", "C15", "C16", "C17", "C18", "C19", "C20", "C21", "C22", "C23", "C24", "C25"),
      row.names = c("1", "2", "3", "4", "5", "6"), class = "data.frame")

Now I want to restructure this dataframe so that it no longer has 25 code fields but a row for each code, but only if the code has a value! The new structure should look something like:

    NewDataFrame <- data.frame(ID=integer(), DSYSRTKY=integer(), CODE=character(), PRIMAIRY=logical())

The ID column should just be an increment. PRIMAIRY is a boolean which should be true if the code was originally the first code (C1).

It has to be efficient since my real data has many more rows than my example structure of only 6 rows. I tried a looping mechanism and it worked, but it was far too slow. Hopefully I provided enough information using dput.

Regards
Derk

-- View this message in context: http://r.789695.n4.nabble.com/Create-rows-for-columns-in-dataframe-tp4673607.html
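One loop-free way to get from the 25 code fields to one row per non-empty code is sketched below. It is an illustration only, not the answer given on the list: `codes_to_long` is a hypothetical name, the key is returned as character rather than integer, and the code columns are assumed to be character or factor.

```r
# Hypothetical helper: one output row per non-empty code field.
codes_to_long <- function(df, id_col = "DSYSRTKY") {
  code_cols <- setdiff(names(df), id_col)

  # Character matrix with one ROW per code field and one COLUMN per input
  # record, so that column-major order walks record by record, C1 first.
  tm <- t(do.call(cbind, lapply(df[code_cols], as.character)))
  keep <- !is.na(tm) & tm != ""

  data.frame(
    ID       = seq_len(sum(keep)),                       # simple increment
    DSYSRTKY = as.character(df[[id_col]])[col(tm)[keep]],
    CODE     = tm[keep],
    PRIMAIRY = row(tm)[keep] == 1L,                      # TRUE for the first code field
    stringsAsFactors = FALSE
  )
}
```

The output keeps the records in their input order and, within a record, the codes in column order, which is the ordering the requested structure implies.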
Re: [R] Create rows for columns in dataframe
Hi,

Here is my desired output for my sample, using dput():

    structure(list(
      ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
        25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48),
      DSYSRTKY = c(10005, 10005, 10005, 10005, 10203, 10203, 10203, 10203,
        10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315,
        10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315, 10315,
        10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327,
        10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327, 10327),
      CODE = c("71535", "78900", "V1251", "V454", "45341", "4019", "72400", "V1011",
        "42831", "5990", "8052", "4241", "4019", "311", "2724", "71680", "4168", "7804",
        "V066", "6930", "41400", "V4581", "40291", "4280", "5990", "V4986", "5939", "3109",
        "41401", "6826", "7850", "4019", "2720", "49390", "2859", "79029", "V1582",
        "486", "51881", "5119", "42789", "7823", "41400", "V4581", "40390", "5859", "49390", "2724"),
      PRIMAIRY = c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE,
        TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
        FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE,
        TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
        TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)),
      .Names = c("ID", "DSYSRTKY", "CODE", "PRIMAIRY"), row.names = c(NA, 48L), class = "data.frame")

So the first 'DSYSRTKY' (10005) has 4 code fields filled, so you get 4 rows. The next one also gives 4, the third one 16. Anyway, just take a look at the sample. I think this will help make clear what my desired result is!

Regards
Derk

-- View this message in context: http://r.789695.n4.nabble.com/Create-rows-for-columns-in-dataframe-tp4673607p4673646.html
[R] Add column to dataframe based on code in other column
Hi all,

I have a dataframe of users which contains US state codes. Now I want to add a column named REGION based on the state code. I have already done a mapping:

    NorthEast <- c("07", "20", "22", "30", "31", "33", "39", "41", "47")
    MidWest <- c("14", "15", "16", "17", "23", "24", "26", "28", "35", "36", "43", "52")
    South <- c("01", "04", "08", "09", "10", "11", "18", "19", "21", "25", "34", "37", "42", "44", "45", "49", "51")
    West <- c("02", "03", "05", "06", "12", "13", "27", "29", "32", "38", "46", "50", "53")
    Other <- c("40", "48", "54", "55", "56", "57", "58", "59", "60", "61", "62", "63", "64", "65", "66", "94", "98", "99")

So for example:

    Name   State_Code
    Tom    20
    Harry  56
    Ben    05
    Sally  04

Should become:

    Name   State_Code  REGION
    Tom    20          NorthEast
    Harry  56          Other
    Ben    05          West
    Sally  04          South

Could anyone help me with a clever statement?

-- View this message in context: http://r.789695.n4.nabble.com/Add-column-to-dataframe-based-on-code-in-other-column-tp4673335.html
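One possible "clever statement" is a named lookup vector, sketched here with the mapping and the four example users from the post. The names `region_codes`, `region_of` and `users` are made up for the illustration, and the state codes are assumed to be stored as text so that leading zeros survive.

```r
# The mapping from the post, collected in one named list.
region_codes <- list(
  NorthEast = c("07", "20", "22", "30", "31", "33", "39", "41", "47"),
  MidWest   = c("14", "15", "16", "17", "23", "24", "26", "28", "35", "36", "43", "52"),
  South     = c("01", "04", "08", "09", "10", "11", "18", "19", "21", "25", "34",
                "37", "42", "44", "45", "49", "51"),
  West      = c("02", "03", "05", "06", "12", "13", "27", "29", "32", "38", "46",
                "50", "53"),
  Other     = c("40", "48", "54", "55", "56", "57", "58", "59", "60", "61", "62",
                "63", "64", "65", "66", "94", "98", "99")
)

# Flatten to a vector whose names are state codes and whose values are regions.
region_of <- setNames(rep(names(region_codes), lengths(region_codes)),
                      unlist(region_codes, use.names = FALSE))

users <- data.frame(Name = c("Tom", "Harry", "Ben", "Sally"),
                    State_Code = c("20", "56", "05", "04"),
                    stringsAsFactors = FALSE)

# Look every user's code up by name; codes missing from the mapping give NA.
users$REGION <- unname(region_of[as.character(users$State_Code)])
```

The lookup is a single vectorised subscript, so it should scale to large user tables without a loop or a merge.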
[R] Retreiving correct data from combining two datasets
Hi all,

I have two datasets:

Dataset 1 - List of Users, the layout looks like this:
    ID Name C1 C2 C3 C23 C24 C25
Dataset 2 - List of Codes, the layout looks like this:
    Code Description Category

The code fields in the user dataset do not have to contain a value, and if they have a value it does not have to correspond with the Codes dataset.

Now I need 2 things:
- Per user record, the number of codes that occur in the Codes table; so if one user had C1, C8, C12 and C19 occurring in the Codes dataset, that would be 4.
- The top 20 of the most frequently occurring codes with their description.

I find this very challenging, but I'm sure there are some R gurus out there who can help me with this :)

I cannot give any sample data because it's classified.

Thanks in advance

-- View this message in context: http://r.789695.n4.nabble.com/Retreiving-correct-data-from-combining-two-datasets-tp4673131.html
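Since no sample data was posted, the following is only a guess at how the two results could be computed. `summarise_codes`, `n_known` and the argument names are hypothetical; the column names `Code` and `Description` come from the layout above, and the Codes table is assumed not to contain an empty-string code.

```r
# Hypothetical helper: per-user match counts plus the most frequent known codes.
summarise_codes <- function(users, codes, code_cols, top_n = 20L) {
  known_codes <- as.character(codes$Code)

  # Character matrix of all code fields, one row per user record.
  m <- do.call(cbind, lapply(users[code_cols], as.character))

  # TRUE where a field holds a code that exists in the Codes table.
  known <- matrix(m %in% known_codes, nrow = nrow(m))
  users$n_known <- rowSums(known)

  # Frequency of every known code, most frequent first (ties by code).
  freq <- as.data.frame(table(Code = m[known]), stringsAsFactors = FALSE)
  freq <- freq[order(-freq$Freq, freq$Code), ]
  top  <- head(freq, top_n)
  top$Description <- as.character(codes$Description)[match(top$Code, known_codes)]
  rownames(top) <- NULL

  list(users = users, top = top)
}
```

A call might look like `res <- summarise_codes(users, codes, paste0("C", 1:25))`, after which `res$users$n_known` holds the per-user count and `res$top` the top-20 table.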
[R] Add a column to a data frame with value based on the percentile of the row
Hi all,

I think this should be an easy question for the gurus out here. I have a large data frame (2,500,000 rows, 15 columns) and I want to add a column named SEGMENT to it.

- The first 5% of the rows (the first 125,000 rows) should have the value "Top 5%" in the SEGMENT column.
- The rows from 5% to 20% should have the value "5 to 20".
- The rows from 20% to 50% should have the value "20 to 50".
- The last 50% of the rows should have the value "Bottom 50".

What is the easiest way of doing this? I was thinking of using quantile, but then I would need some row-number column.

Regards
Derk

-- View this message in context: http://r.789695.n4.nabble.com/Add-a-column-to-a-data-frame-with-value-based-on-the-percentile-of-the-row-tp4672711.html
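A position-based sketch of the segmentation described above, not necessarily the answer that was given on the list. `add_segment` is a hypothetical name; it uses each row's position in the current row order rather than its row name, so it also works on a data frame that was sorted beforehand.

```r
# Hypothetical helper: label rows by their relative position in the data frame.
add_segment <- function(df) {
  n <- nrow(df)
  share <- seq_len(n) / n   # 1/n, 2/n, ..., 1 in the current row order
  df$SEGMENT <- cut(share,
                    breaks = c(0, 0.05, 0.20, 0.50, 1),
                    labels = c("Top 5%", "5 to 20", "20 to 50", "Bottom 50"))
  df
}
```

`cut()` uses right-closed intervals by default, so with 2,500,000 rows the row at position 125,000 (a share of exactly 0.05) still falls in "Top 5%". To segment by a value, sort first, for example `add_segment(df[order(-df$value), ])`, where `value` is a placeholder column name.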
Re: [R] Add a column to a data frame with value based on the percentile of the row
Works like a charm, thanks a lot!

-- View this message in context: http://r.789695.n4.nabble.com/Add-a-column-to-a-data-frame-with-value-based-on-the-percentile-of-the-row-tp4672711p4672728.html
Re: [R] Add a column to a data frame with value based on the percentile of the row
Hi Arun Kirshna,

I have tested your method and it will work for me. I only run into one problem: before I do this operation I sort my data frame, so my row numbers are no longer consecutive. You can see this if you first order your example data frame like:

    dat1 <- dat1[order(-dat1$value),]
    head(dat1)
         ID    value   SEGMENT
    237 237 3.538552  20 to 50
    21   21 3.376149    Top 5%
    421 421 3.015634 Bottom 50
    339 339 2.855991 Bottom 50
    119 119 2.589574  20 to 50
    12   12 2.512276    Top 5%

Do you have a solution for this?

-- View this message in context: http://r.789695.n4.nabble.com/Add-a-column-to-a-data-frame-with-value-based-on-the-percentile-of-the-row-tp4672711p4672726.html
Re: [R] Saving multiple rda-files as one rda-file
Really no one has any suggestions on this issue?

-- View this message in context: http://r.789695.n4.nabble.com/Saving-multiple-rda-files-as-one-rda-file-tp4672041p4672278.html
Re: [R] Saving multiple rda-files as one rda-file
Hi,

Yes, maybe I should have been clearer about my problem. I want to append the different data frames back into one variable (rbind) and save it as one R data file.

Regards
Derk

-- View this message in context: http://r.789695.n4.nabble.com/Saving-multiple-rda-files-as-one-rda-file-tp4672041p4672313.html
[R] Saving multiple rda-files as one rda-file
Hi all,

For a project we have to process some very large CSV files (up to 40 GB). To reduce them in size and increase operating performance I wanted to store them as RData files. Since one file was too big, I decided to split the CSV and save the parts as separate .rda files. So far so good. Now I want to bind them all together and save them as one .rda file again, and this is surprisingly difficult.

First I load my rda files into my environment:

    load(paste(rdaoutputdir, "file1.rda", sep=""))
    load(paste(rdaoutputdir, "file2.rda", sep=""))
    load(paste(rdaoutputdir, "file3.rda", sep=""))
    etc

Then I try to combine them into one object. Using rbind like this gives memory allocation problems ('Error: cannot allocate vector of size'):

    objectToSave <- rbind(object1, object2, object3)

Using pre-allocation gives me a factor level error. I used this code:

    nextrow <- nrow(object1)+1
    object1[nextrow:(nextrow+nrow(object2)-1),] <- object2
    # we need to assure unique row names
    row.names(object1) = 1:nrow(object1)
    rm(object2)
    gc()

15! warning messages:

    1: In `[<-.factor`(`*tmp*`, iseq, value = structure(c(1L, ... : invalid factor level, NA generated
    2: In `[<-.factor`(`*tmp*`, iseq, value = structure(c(1L, ... : invalid factor level, NA generated

What can I do?

Regards
Derk

-- View this message in context: http://r.789695.n4.nabble.com/Saving-multiple-rda-files-as-one-rda-file-tp4672041.html
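The "invalid factor level" warnings above appear when values are assigned into a factor column that does not have those levels. One way around that, sketched here under the assumption that each .rda file holds exactly one data frame, is to convert factor columns to character before binding. `load_one` and `combine_rda` are hypothetical helper names, not functions from any package.

```r
# Hypothetical helper: load the single data frame stored in one .rda file and
# turn its factor columns into character so that levels cannot clash later.
load_one <- function(path) {
  env <- new.env()
  nm  <- load(path, envir = env)      # load() returns the names it restored
  obj <- get(nm[1], envir = env)
  is_fac <- vapply(obj, is.factor, logical(1))
  if (any(is_fac)) obj[is_fac] <- lapply(obj[is_fac], as.character)
  obj
}

# Hypothetical helper: bind all parts and save them as one object.
combine_rda <- function(paths, out_path) {
  combined <- do.call(rbind, lapply(paths, load_one))
  rownames(combined) <- NULL
  save(combined, file = out_path)     # stored under the name "combined"
  invisible(combined)
}
```

This avoids the factor warnings but not the memory limit: the combined data frame still has to fit in RAM, and `rbind` needs room for both the parts and the result while it runs.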