Re: [R] Create rows for columns in dataframe
Hi A.K, Thanks for your great help. I'm now running your first suggestion on a 600.000 row sample after verifying it works on a smaller sample. It's now been running for 40 minutes. Which method do you think will be faster? Regards Derk -- View this message in context: http://r.789695.n4.nabble.com/Create-rows-for-columns-in-dataframe-tp4673607p4673704.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Create rows for columns in dataframe
Hi,

I tried the second method on a bigger dataset. This is what I get:

indx <- rep(1:nrow(dat1), 6e4)
dat2 <- dat1[indx, ]
system.time({
  vec1 <- paste(dat2[, 1], dat2[, 2], colnames(dat2)[2], sep = ".")
  res2 <- reshape(dat2, idvar = "newCol", varying = list(2:26), direction = "long")
  res3 <- res2[order(res2[, 4]), ]
  res4 <- res3[res3[, 3] != "", -4]
  vec2 <- paste(res4[, 1], res4[, 3], paste0("C", res4[, 2]), sep = ".")
  res4$PRIMAIRY <- vec2 %in% vec1
  row.names(res4) <- 1:nrow(res4)
  res4$ID <- row.names(res4)
  res4[, c(1, 3)] <- lapply(res4[, c(1, 3)], as.character)
  res5 <- res4[, c(5, 1, 3, 4)]
  colnames(res5)[3] <- "CODE"
})
#    user  system elapsed
# 144.672   2.072 147.034
# The reshape() step takes most of the time.

dim(res5)
#[1] 2880000       4

Comparing this to the first method on a smaller subset of dat2:

dat2New <- dat2[1:3e4, ]
system.time({
  res1 <- do.call(rbind, lapply(seq_len(nrow(dat2New)), function(i) {
    x1 <- as.character(unlist(dat2New[i, -1]))
    CODE <- x1[x1 != ""]
    PRIMAIRY <- x1[x1 != ""] == head(x1, 1)
    DSYSRTKY <- as.numeric(as.character(dat2[i, 1]))
    data.frame(DSYSRTKY, CODE, PRIMAIRY, stringsAsFactors = FALSE)
  }))
  res1$ID <- row.names(res1)
  res2 <- res1[, c(4, 1:3)]
})
#    user  system elapsed
# 166.452  15.752 182.643

nrow(dat2) - nrow(dat2New)
#[1] 330000

So the first method needs longer for 30,000 input rows than the second method needs for all 360,000. You might also try library(data.table). It should be faster.

A.K.
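The data.table suggestion above is not spelled out in the thread. The following is only a minimal sketch of how it might look, assuming the data.table package is installed. The toy table, the helper column row_id, and the column name COL are made up for illustration; whether this is faster or lighter on memory for the real files would need to be measured.

```r
library(data.table)

# Toy stand-in for dat1 (hypothetical values): one key column, three code columns.
toy <- data.table(
  DSYSRTKY = c(10005, 10203),
  C1 = c("71535", "45341"),
  C2 = c("78900", ""),
  C3 = c("", "")
)
toy[, row_id := .I]  # remember the input row so the original order can be restored

# Wide to long: one row per (input row, code column) pair.
# COL is a factor whose levels follow the column order C1, C2, C3.
long <- melt(toy, id.vars = c("row_id", "DSYSRTKY"),
             variable.name = "COL", value.name = "CODE")

long <- long[CODE != ""]         # keep only code fields that have a value
setorder(long, row_id, COL)      # input row first, then column order
long[, PRIMAIRY := COL == "C1"]  # TRUE when the code came from C1
long[, ID := .I]                 # running ID

out <- long[, .(ID, DSYSRTKY, CODE, PRIMAIRY)]
print(out)
```

Flagging PRIMAIRY by the source column rather than by the code value keeps the flag correct even if the C1 code is repeated in a later column of the same row.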
Re: [R] Create rows for columns in dataframe
Hi,

This seemed to be faster than the other two methods:

vec1 <- as.character(rep(dat1[, 1], each = (ncol(dat1) - 1)))
vec2 <- as.character(unlist(t(dat1[, -1])))
vec3 <- rep(rep(c(TRUE, FALSE), c(1, (ncol(dat1) - 2))), nrow(dat1))
dat2 <- data.frame(DSYSRTKY = vec1, CODE = vec2, PRIMAIRY = vec3, stringsAsFactors = FALSE)
dat3 <- dat2[dat2[, 2] != "", ]
row.names(dat3) <- 1:nrow(dat3)
dat3New <- within(dat3, {ID <- row.names(dat3)})[, c(4, 1:3)]

# Out1 is the desired output dataset
Out1$PRIMAIRY <- as.logical(Out1$PRIMAIRY)
identical(Out1, dat3New)
#[1] TRUE

Speed test:

indx <- rep(1:nrow(dat1), 6e4)
dat2 <- dat1[indx, ]
system.time({
  vec1 <- as.character(rep(dat2[, 1], each = (ncol(dat2) - 1)))
  vec2 <- as.character(unlist(t(dat2[, -1])))
  vec3 <- rep(rep(c(TRUE, FALSE), c(1, (ncol(dat2) - 2))), nrow(dat2))
  dat4 <- data.frame(DSYSRTKY = vec1, CODE = vec2, PRIMAIRY = vec3, stringsAsFactors = FALSE)
  dat5 <- dat4[dat4[, 2] != "", ]
  row.names(dat5) <- 1:nrow(dat5)
  dat5New <- within(dat5, {ID <- row.names(dat5)})[, c(4, 1:3)]
})
#   user  system elapsed
# 12.620   0.684  13.333

dim(dat5New)
#[1] 2880000       4

A.K.
Re: [R] Create rows for columns in dataframe
Hi Arun,

The second method is indeed much faster. It ran quickly on my 600,000-row sample. I still have two bigger files where processing becomes an issue, even though I have plenty of memory (32 GB). The problem is the second statement:

res2 <- reshape(dat2, idvar = "newCol", varying = list(2:26), direction = "long")

Would data.table also take less memory? Speeding things up further would be welcome too. How would I do it? Splitting the data frame into parts, reshaping each part, and combining the results afterwards might also be an option. Any ideas on that?

Regards,
Dirk

-- View this message in context: http://r.789695.n4.nabble.com/Create-rows-for-columns-in-dataframe-tp4673607p4673750.html Sent from the R help mailing list archive at Nabble.com.
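The splitting idea raised above could be sketched roughly as follows in base R. This is an illustration under assumptions, not something the thread tested: the function name to_long, the toy data, and chunk_size are hypothetical, and the conversion step is a vectorised wide-to-long transform rather than reshape().

```r
# Vectorised wide-to-long conversion for one piece of the data
# (to_long is a hypothetical helper name).
to_long <- function(d) {
  n_codes <- ncol(d) - 1
  out <- data.frame(
    DSYSRTKY = rep(as.character(d[[1]]), each = n_codes),
    # t() puts each input row in a column, so as.vector() reads row by row
    CODE = as.vector(t(as.matrix(d[, -1, drop = FALSE]))),
    # first code column of every input row is the primary one
    PRIMAIRY = rep(c(TRUE, rep(FALSE, n_codes - 1)), times = nrow(d)),
    stringsAsFactors = FALSE
  )
  out[out$CODE != "", ]  # drop empty code fields
}

# Toy stand-in for the real data (hypothetical values).
toy <- data.frame(
  DSYSRTKY = c(10005, 10203, 10315),
  C1 = c("71535", "45341", "42831"),
  C2 = c("78900", "", "5990"),
  C3 = c("", "", "8052"),
  stringsAsFactors = FALSE
)

chunk_size <- 2  # would be far larger for real files
chunk_id <- ceiling(seq_len(nrow(toy)) / chunk_size)

# Convert each chunk on its own, then stack the pieces in input order.
pieces <- lapply(split(toy, chunk_id), to_long)
res <- do.call(rbind, pieces)
row.names(res) <- NULL
res$ID <- seq_len(nrow(res))
res <- res[, c("ID", "DSYSRTKY", "CODE", "PRIMAIRY")]
print(res)
```

Chunking only limits the size of the intermediate objects built per piece; the combined result still has to fit in memory, so for very large files each piece might instead be written to disk as it is produced.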
Re: [R] Create rows for columns in dataframe
Hi,

Your desired output is not clear. Maybe this helps:

# dat1 is the dataset
dat1$ID <- 1:nrow(dat1)
library(reshape2)
res1 <- melt(dat1, id.vars = c("ID", "DSYSRTKY"))
res1$value <- res1$value != ""
res1[, 2] <- as.integer(as.character(res1[, 2]))
res1[, 3] <- as.character(res1[, 3])
colnames(res1)[3:4] <- c("CODE", "PRIMARY")
head(res1)
#  ID DSYSRTKY CODE PRIMARY
#1  1    10005   C1    TRUE
#2  2    10203   C1    TRUE
#3  3    10315   C1    TRUE
#4  4    10315   C1    TRUE
#5  5    10327   C1    TRUE
#6  6    10327   C1    TRUE

A.K.

----- Original Message -----
From: Dark <i...@software-solutions.nl>
To: r-help@r-project.org
Cc:
Sent: Tuesday, August 13, 2013 5:46 AM
Subject: [R] Create rows for columns in dataframe

Hi experts,

I have a data frame with 100k+ records. It has a key/id column and 25 code columns. I would like to restructure it so that it has a row for each code column. I have a structure like this (used dput):

structure(list(
DSYSRTKY = structure(c(1L, 2L, 3L, 3L, 4L, 4L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("10005", "10203", "10315", "10327"), class = "factor"),
C1 = structure(c(6L, 3L, 2L, 5L, 1L, 4L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("41401", "42831", "45341", "486", "5990", "71535"), class = "factor"),
C2 = structure(c(5L, 1L, 3L, 6L, 4L, 2L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("4019", "51881", "5990", "6826", "78900", "V4986"), class = "factor"),
C3 = structure(c(6L, 3L, 5L, 2L, 4L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("5119", "5939", "72400", "7850", "8052", "V1251"), class = "factor"),
C4 = structure(c(6L, 5L, 3L, 1L, 2L, 4L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("3109", "4019", "4241", "42789", "V1011", "V454"), class = "factor"),
C5 = structure(c(1L, 1L, 3L, 1L, 2L, 4L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "2720", "4019", "7823"), class = "factor"),
C6 = structure(c(1L, 1L, 2L, 1L, 4L, 3L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "311", "41400", "49390"), class = "factor"),
C7 = structure(c(1L, 1L, 2L, 1L, 3L, 4L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "2724", "2859", "V4581"), class = "factor"),
C8 = structure(c(1L, 1L, 3L, 1L, 4L, 2L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "40390", "71680", "79029"), class = "factor"),
C9 = structure(c(1L, 1L, 2L, 1L, 4L, 3L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "4168", "5859", "V1582"), class = "factor"),
C10 = structure(c(1L, 1L, 3L, 1L, 1L, 2L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "49390", "7804"), class = "factor"),
C11 = structure(c(1L, 1L, 3L, 1L, 1L, 2L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "2724", "V066"), class = "factor"),
C12 = structure(c(1L, 1L, 2L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "6930"), class = "factor"),
C13 = structure(c(1L, 1L, 2L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "41400"), class = "factor"),
C14 = structure(c(1L, 1L, 2L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "V4581"), class = "factor"),
C15 = structure(c(1L, 1L, 2L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "40291"), class = "factor"),
C16 = structure(c(1L, 1L, 2L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = c("", "4280"), class = "factor"),
C17 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C18 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C19 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C20 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C21 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C22 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C23 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C24 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor"),
C25 = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Names = c("1", "2", "3", "4", "5", "6"), .Label = "", class = "factor")),
.Names = c("DSYSRTKY", "C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8", "C9", "C10", "C11", "C12", "C13", "C14", "C15", "C16", "C17", "C18", "C19", "C20", "C21", "C22", "C23", "C24", "C25"),
row.names = c("1", "2", "3", "4", "5", "6"), class = "data.frame")

Now I want to restructure this data frame so that, instead of 25 code fields, it has one row per code, but only if the code has a value. The new structure should look something like:

NewDataFrame <- data.frame(ID=integer(), DSYSRTKY=integer(), CODE=character(), PRIMAIRY=logical())

The ID column should just be an increment. PRIMAIRY is a boolean that should be TRUE if the code was originally the first code (C1).

It has to be efficient, since my real data has many more rows than my example structure of only 6 rows. I tried a looping approach; it produced the right result, but it was far too slow.
Re: [R] Create rows for columns in dataframe
Hi,

Here is my desired output for my sample, using dput():

structure(list(
ID = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48"),
DSYSRTKY = c("10005", "10005", "10005", "10005", "10203", "10203", "10203", "10203", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10315", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327", "10327"),
CODE = c("71535", "78900", "V1251", "V454", "45341", "4019", "72400", "V1011", "42831", "5990", "8052", "4241", "4019", "311", "2724", "71680", "4168", "7804", "V066", "6930", "41400", "V4581", "40291", "4280", "5990", "V4986", "5939", "3109", "41401", "6826", "7850", "4019", "2720", "49390", "2859", "79029", "V1582", "486", "51881", "5119", "42789", "7823", "41400", "V4581", "40390", "5859", "49390", "2724"),
PRIMAIRY = c("TRUE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE")),
.Names = c("ID", "DSYSRTKY", "CODE", "PRIMAIRY"), row.names = c(NA, 48L), class = "data.frame")

So the first row (DSYSRTKY 10005) has 4 code fields filled, so you get 4 rows. The next one also gives 4, the third one 16. Anyway, just take a look at the sample. I think this will help make clear what my desired result is.

Regards,
Derk

-- View this message in context: http://r.789695.n4.nabble.com/Create-rows-for-columns-in-dataframe-tp4673607p4673646.html Sent from the R help mailing list archive at Nabble.com.
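The row counts mentioned above (4, 4, 16, ...) are simply the number of non-empty code fields per input row, which gives a cheap sanity check on any result. A minimal base R sketch, using a small hypothetical toy data frame in place of the real data:

```r
# Toy stand-in for the input (hypothetical values): key column plus three code columns.
toy <- data.frame(
  DSYSRTKY = c(10005, 10203, 10315),
  C1 = c("71535", "45341", "42831"),
  C2 = c("78900", "", "5990"),
  C3 = c("", "", "8052"),
  stringsAsFactors = FALSE
)

# Non-empty code fields per input row = output rows contributed by that row.
n_per_row <- rowSums(toy[, -1] != "")
n_total <- sum(n_per_row)  # expected number of rows in the long result

print(n_per_row)
print(n_total)
```

Comparing n_total with nrow() of the reshaped result catches dropped or duplicated rows without inspecting the data itself.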
Re: [R] Create rows for columns in dataframe
According to your first post:

NewDataFrame <- data.frame(ID=integer(), DSYSRTKY=integer(), CODE=character(), PRIMAIRY=logical())

The new output dataset, Out1:

str(Out1)
#'data.frame': 48 obs. of 4 variables:
# $ ID      : chr "1" "2" "3" "4" ...
# $ DSYSRTKY: chr "10005" "10005" "10005" "10005" ...
# $ CODE    : chr "71535" "78900" "V1251" "V454" ...
# $ PRIMAIRY: chr "TRUE" "FALSE" "FALSE" "FALSE" ...

I guess you wanted DSYSRTKY to be numeric and PRIMAIRY to be logical:

res1 <- do.call(rbind, lapply(seq_len(nrow(dat1)), function(i) {
  x1 <- as.character(unlist(dat1[i, -1]))
  CODE <- x1[x1 != ""]
  PRIMAIRY <- x1[x1 != ""] == head(x1, 1)
  DSYSRTKY <- as.numeric(as.character(dat1[i, 1]))
  data.frame(DSYSRTKY, CODE, PRIMAIRY, stringsAsFactors = FALSE)
}))
res1$ID <- row.names(res1)
res2 <- res1[, c(4, 1:3)]

str(res2)
#'data.frame': 48 obs. of 4 variables:
# $ ID      : chr "1" "2" "3" "4" ...
# $ DSYSRTKY: num 1e+08 1e+08 1e+08 1e+08 1e+08 ...
# $ CODE    : chr "71535" "78900" "V1251" "V454" ...
# $ PRIMAIRY: logi TRUE FALSE FALSE FALSE TRUE FALSE ...

head(res2)
#  ID DSYSRTKY  CODE PRIMAIRY
#1  1    10005 71535     TRUE
#2  2    10005 78900    FALSE
#3  3    10005 V1251    FALSE
#4  4    10005  V454    FALSE
#5  5    10203 45341     TRUE
#6  6    10203  4019    FALSE

head(Out1)
#  ID DSYSRTKY  CODE PRIMAIRY
#1  1    10005 71535     TRUE
#2  2    10005 78900    FALSE
#3  3    10005 V1251    FALSE
#4  4    10005  V454    FALSE
#5  5    10203 45341     TRUE
#6  6    10203  4019    FALSE

A.K.
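The as.numeric(as.character(...)) conversion used for DSYSRTKY in the reply above matters because the key column is a factor. A short illustration with a made-up factor f:

```r
# A factor stores integer level codes plus a table of labels.
f <- factor(c("10005", "10203", "10005"))

codes  <- as.numeric(f)                # level codes, not the key values
values <- as.numeric(as.character(f))  # labels first, then numbers

print(codes)
print(values)
```

Calling as.numeric() directly on the factor would silently return the level codes, so going through as.character() first is what recovers the actual key values.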
Re: [R] Create rows for columns in dataframe
You could also try:

# Out1 is the output dataset
Out1$PRIMAIRY <- as.logical(Out1$PRIMAIRY)  # changing the class

# dat1 is the input dataset
vec1 <- paste(dat1[, 1], dat1[, 2], colnames(dat1)[2], sep = ".")
res2 <- reshape(dat1, idvar = "newCol", varying = list(2:26), direction = "long")
res3 <- res2[order(res2[, 4]), ]
res4 <- res3[res3[, 3] != "", -4]
vec2 <- paste(res4[, 1], res4[, 3], paste0("C", res4[, 2]), sep = ".")
res4$PRIMAIRY <- vec2 %in% vec1
row.names(res4) <- 1:nrow(res4)
res4$ID <- row.names(res4)
res4[, c(1, 3)] <- lapply(res4[, c(1, 3)], as.character)
res5 <- res4[, c(5, 1, 3, 4)]
colnames(res5)[3] <- "CODE"

identical(res5, Out1)
#[1] TRUE

A.K.