HI Shreya, #Looks like you run the two line code as a single line. result3<- data.frame(result2[,-5],read.table(text=as.character(result2$comment),sep="|",fill=TRUE,na.strings=""),stringsAsFactors=FALSE)
colnames(result3)[5:7]<- paste0("DataComment",1:3) result3 # Row_ID_CR Data1 Data2 Data3 DataComment1 #1 1 aa bb cc This is comment 1 #2 2 dd ee ff This is comment 1 # DataComment2 DataComment3 #1 This is comment 2 This is comment 3 #2 <NA> <NA> A.K. ________________________________ From: Shreya Rawal <rawal.shr...@gmail.com> To: arun <smartpink...@yahoo.com> Cc: R help <r-help@r-project.org>; jim holtman <jholt...@gmail.com> Sent: Wednesday, June 12, 2013 8:58 AM Subject: Re: [R] Combining CSV data Great, thanks Arun, but I seem to be running into this error. Not sure what did I miss. > result<-data.frame(final_ouput[,-5],read.table(text=as.character(final_output$comment),sep="|",fill=TRUE,na.strings=""),stringsAsFactors=FALSE)colnames(result)[5:7]<-paste0("DataComment",1:3) Error: unexpected symbol in "result<-data.frame(final_ouput[,-5],read.table(text=as.character(final_output$comment),sep="|",fill=TRUE,na.strings=""),stringsAsFactors=FALSE)colnames" On Tue, Jun 11, 2013 at 5:09 PM, arun <smartpink...@yahoo.com> wrote: > > >HI, >You could use: >result3<- >data.frame(result2[,-5],read.table(text=as.character(result2$comment),sep="|",fill=TRUE,na.strings=""),stringsAsFactors=FALSE) >colnames(result3)[5:7]<- paste0("DataComment",1:3) > >A.K. >________________________________ >From: Shreya Rawal <rawal.shr...@gmail.com> >To: arun <smartpink...@yahoo.com> >Sent: Tuesday, June 11, 2013 4:22 PM > >Subject: Re: [R] Combining CSV data > > > >Hey Arun, > >I guess you could guide me with this a little bit. I have been working on the >solution Jim suggested (and also because that I could understand it with my >little knowledge of R :)) > >So with these commands I am able to get the data in this format: > >> fileA <- read.csv(text = "Row_ID_CR, Data1, Data2, Data3 >+ 1, aa, bb, cc >+ 2, dd, ee, ff", as.is = TRUE) >> >> fileB <- read.csv(text = "Row_ID_N, Src_Row_ID, DataN1 >+ 1a, 1, This is comment 1 >+ 2a, 1, This is comment 2 >+ 3a, 2, This is comment 1 >+ 4a, 1, This is comment 3", as.is = TRUE) >> >> # get rid of leading/trailing blanks on comments >> fileB$DataN1 <- gsub("^ *| *$", "", fileB$DataN1) >> >> # merge together >> result <- merge(fileA, fileB, by.x = 'Row_ID_CR', by.y = "Src_Row_ID") >> >> # now partition by Row_ID_CR and aggregate the comments >> result2 <- do.call(rbind, >+ lapply(split(result, result$Row_ID_CR), function(.grp){ >+ cbind(.grp[1L, -c(5,6)], comment = paste(.grp$DataN1, collapse = >'|')) >+ }) >+ ) > >Row_ID_CR Data1 Data2 Data3 > comment >1 1 aa bb cc > This is comment 1| This is comment 2| This is >comment 3 >2 2 dd ee ff > This is comment 1| This is Comment 2 > >I can even split the last column by this: >strsplit(as.character(result2$comment), split='\\|') > >[[1]] >[1] "This is comment 1" "This is comment 2" " This is comment 3" > >[[2]] >[1] "This is comment 1" "This is comment 2" > > >but now I am not sure how to combine everything together. I guess by now you >must have realized how new I am to R :) > >Thanks!! >Shreya > > > > > > >On Tue, Jun 11, 2013 at 1:02 PM, arun <smartpink...@yahoo.com> wrote: > >Hi, >>If the dataset is like this with the comments in the order: >> >>dat2<-read.table(text=" >>Row_ID_N, Src_Row_ID, DataN1 >>1a, 1, This is comment 1 >>2a, 1, This is comment 2 >>3a, 2, This is comment 1 >>4a, 1, This is comment 3 >>",sep=",",header=TRUE,stringsAsFactors=FALSE) >> >>dat3<-read.table(text=" >>Row_ID_N, Src_Row_ID, DataN1 >>1a, 1, This is comment 1 >>2a, 1, This is comment 2 >>3a, 2, This is comment 1 # >> >>4a, 1, This is comment 3 >>5a, 2, This is comment 2 # >> >>",sep=",",header=TRUE,stringsAsFactors=FALSE) >> >> >>library(stringr) >>library(plyr) >>fun1<- function(data1,data2){ >> data2$DataN1<- str_trim(data2$DataN1) >> res<- merge(data1,data2,by.x=1,by.y=2) >> res1<- res[,-5] >> res2<- >>ddply(res1,.(Row_ID_CR,Data1,Data2,Data3),summarize,DataN1=list(DataN1)) >> Mx1<- max(sapply(res2[,5],length)) >> res3<- data.frame(res2[,-5],do.call(rbind,lapply(res2[,5],function(x){ >> c(x,rep(NA,Mx1-length(x))) >> >> })),stringsAsFactors=FALSE) >> colnames(res3)[grep("X",colnames(res3))]<- >>paste0("DataComment",gsub("[[:alpha:]]","",colnames(res3)[grep("X",colnames(res3))])) >> res3 >> } >> >> >>fun1(dat1,dat2) >># Row_ID_CR Data1 Data2 Data3 DataComment1 >>#1 1 aa bb cc This is comment 1 >> >>#2 2 dd ee ff This is comment 1 >># DataComment2 DataComment3 >>#1 This is comment 2 This is comment 3 >>#2 <NA> <NA> >> >> fun1(dat1,dat3) >># Row_ID_CR Data1 Data2 Data3 DataComment1 >>#1 1 aa bb cc This is comment 1 >> >>#2 2 dd ee ff This is comment 1 >> # DataComment2 DataComment3 >>#1 This is comment 2 This is comment 3 >> >>#2 This is comment 2 <NA> >> >> >>Otherwise, you need to provide an example that matches the real dataset. >>A.K. >> >>________________________________ >>From: Shreya Rawal <rawal.shr...@gmail.com> >>To: arun <smartpink...@yahoo.com> >>Cc: R help <r-help@r-project.org> >>Sent: Tuesday, June 11, 2013 12:22 PM >> >>Subject: Re: [R] Combining CSV data >> >> >> >>Hi Arun, >> >>Thanks for your reply. Unfortunately the Comments are just text in the real >>data. There is no way to differentiate based on the value of the Comments >>column. I guess because of that reason I couldn't get your solution to work >>properly. Do you think I can try it for a more general case where we don't >>merger/split the comments based on the values? >> >>Thanks for your help, I appreciate! >> >> >> >>On Mon, Jun 10, 2013 at 10:14 PM, arun <smartpink...@yahoo.com> wrote: >> >>HI, >>>I am not sure about your DataN1 column. If there is any identifier to >>>differentiate the comments (in this case 1,2,3), then it will easier to >>>place that in the correct column. >>> My previous solution is not helpful in situations like these: >>> >>>dat2<-read.table(text=" >>>Row_ID_N, Src_Row_ID, DataN1 >>>1a, 1, This is comment 1 >>>2a, 1, This is comment 2 >>>3a, 2, This is comment 2 >>>4a, 1, This is comment 3 >>>",sep=",",header=TRUE,stringsAsFactors=FALSE) >>>dat3<-read.table(text=" >>> >>>Row_ID_N, Src_Row_ID, DataN1 >>>1a, 1, This is comment 1 >>>2a, 1, This is comment 2 >>>3a, 2, This is comment 3 >>>4a, 1, This is comment 3 >>>5a, 2, This is comment 2 >>>",sep=",",header=TRUE,stringsAsFactors=FALSE) >>> >>> >>>library(stringr) >>>library(plyr) >>>fun1<- function(data1,data2){ >>> data2$DataN1<- str_trim(data2$DataN1) >>> res<- merge(data1,data2,by.x=1,by.y=2) >>> res1<- res[,-5] >>> res2<- >>>ddply(res1,.(Row_ID_CR,Data1,Data2,Data3),summarize,DataN1=list(DataN1)) >>> Mx1<- max(sapply(res2[,5],length)) >>> res3<- data.frame(res2[,-5],do.call(rbind,lapply(res2[,5],function(x){ >>> indx<- as.numeric(gsub("[[:alpha:]]","",x)) >>> x[match(seq(Mx1),indx)] >>> })),stringsAsFactors=FALSE) >>> >>> colnames(res3)[grep("X",colnames(res3))]<- >>>paste0("DataComment",gsub("[[:alpha:]]","",colnames(res3)[grep("X",colnames(res3))])) >>> res3 >>> } >>>fun1(dat1,dat2) >>> >>># Row_ID_CR Data1 Data2 Data3 DataComment1 >>>#1 1 aa bb cc This is comment 1 >>>#2 2 dd ee ff <NA> >>> >>># DataComment2 DataComment3 >>>#1 This is comment 2 This is comment 3 >>>#2 This is comment 2 <NA> >>> fun1(dat1,dat3) >>> >>># Row_ID_CR Data1 Data2 Data3 DataComment1 >>>#1 1 aa bb cc This is comment 1 >>>#2 2 dd ee ff <NA> >>> >>># DataComment2 DataComment3 >>>#1 This is comment 2 This is comment 3 >>>#2 This is comment 2 This is comment 3 >>> >>> >>> >>>A.K. >>> >>> >>>----- Original Message ----- >>> >>>From: arun <smartpink...@yahoo.com> >>>To: Shreya Rawal <rawal.shr...@gmail.com> >>>Cc: R help <r-help@r-project.org> >>>Sent: Monday, June 10, 2013 6:41 PM >>>Subject: Re: [R] Combining CSV data >>> >>>Hi, >>>Try this: >>> >>>dat1<-read.table(text=" >>>Row_ID_CR, Data1, Data2, Data3 >>>1, aa, bb, cc >>>2, dd, ee, ff >>>",sep=",",header=TRUE,stringsAsFactors=FALSE) >>> >>>dat2<-read.table(text=" >>>Row_ID_N, Src_Row_ID, DataN1 >>>1a, 1, This is comment 1 >>>2a, 1, This is comment 2 >>>3a, 2, This is comment 1 >>>4a, 1, This is comment 3 >>>",sep=",",header=TRUE,stringsAsFactors=FALSE) >>>library(stringr) >>>dat2$DataN1<-str_trim(dat2$DataN1) >>>res<- merge(dat1,dat2,by.x=1,by.y=2) >>> res1<-res[,-5] >>>library(plyr) >>> res2<-ddply(res1,.(Row_ID_CR,Data1,Data2,Data3),summarize, >>>DataN1=list(DataN1)) >>> res2 >>> # Row_ID_CR Data1 Data2 Data3 >>>#1 1 aa bb cc >>>#2 2 dd ee ff >>># DataN1 >>>#1 This is comment 1, This is comment 2, This is comment 3 >>>#2 This is comment 1 >>> >>> >>> >>>res3<-data.frame(res2[,-5],t(apply(do.call(rbind,res2[,5]),1,function(x) >>>{x[duplicated(x)]<-NA;x}))) >>> colnames(res3)[grep("X",colnames(res3))]<- >>>paste0("DataComment",gsub("[[:alpha:]]","",colnames(res3)[grep("X",colnames(res3))])) >>>res3 >>># Row_ID_CR Data1 Data2 Data3 DataComment1 >>>#1 1 aa bb cc This is comment 1 >>>#2 2 dd ee ff This is comment 1 >>># DataComment2 DataComment3 >>>#1 This is comment 2 This is comment 3 >>>#2 <NA> <NA> >>> >>>A.K. >>> >>> >>>----- Original Message ----- >>>From: Shreya Rawal <rawal.shr...@gmail.com> >>>To: r-help@r-project.org >>>Cc: >>>Sent: Monday, June 10, 2013 4:38 PM >>>Subject: [R] Combining CSV data >>> >>>Hello R community, >>> >>>I am trying to combine two CSV files that look like this: >>> >>>File A >>> >>>Row_ID_CR, Data1, Data2, Data3 >>>1, aa, bb, cc >>>2, dd, ee, ff >>> >>> >>>File B >>> >>>Row_ID_N, Src_Row_ID, DataN1 >>>1a, 1, This is comment 1 >>>2a, 1, This is comment 2 >>>3a, 2, This is comment 1 >>>4a, 1, This is comment 3 >>> >>>And the output I am looking for is, comparing the values of Row_ID_CR and >>>Src_Row_ID >>> >>>Output >>> >>>ROW_ID_CR, Data1, Data2, Data3, DataComment1, >>>DataComment2, DataComment3 >>>1, aa, bb, cc, This is >>>comment1, This is comment2, This is comment 3 >>>2, dd, ee, ff, This is >>>comment1 >>> >>> >>>I am a novice R user, I am able to replicate a left join but I need a bit >>>more in the final result. >>> >>> >>>Thanks!! >>> >>> [[alternative HTML version deleted]] >>> >>>______________________________________________ >>>R-help@r-project.org mailing list >>>https://stat.ethz.ch/mailman/listinfo/r-help >>>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>>and provide commented, minimal, self-contained, reproducible code. >>> >>> >> > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.