Re: [R] transforming a .csv file column names as per a particular column rows using R code
Hi Jeff Newmiller, The script given by you is perfect for my requirement. The rest of the scripts given by other authors are failing to format the file header which are giving like eachdatacolumn.1 or eachdatacolumn.2 etc, whereas I need them as eachdatacolumn_firstactualstepnumber, eachdatacolumn_secondactualstepnumber etc. in my output. - Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: http://r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137p4646188.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transforming a .csv file column names as per a particular column rows using R code
You can also get the Tool column repeated by tricking reshape() into letting you use Tool in the v.name and idvar arguments: > dta1 <- read.csv( text= + "Tool,Step_Number,Data1,Data2 + A,1,0,1 + A,2,3,2 + A,3,2,3 + B,1,3,2 + B,2,1,2 + B,3,3,2 + ") # Repeat the first column. R will append .1 to the second copy. > dta2 <- reshape(dta1[, c(1, 1:4)], v.names=c("Tool", "Data1", "Data2"), timevar="Step_Number", idvar="Tool.1", direction="wide") > rownames(dta2) <- NULL > colnames(dta2) <- gsub("\\.", "_", colnames(dta2)) > dta2 Tool_1 Data1_1 Data2_1 Tool_2 Data1_2 Data2_2 Tool_3 Data1_3 Data2_3 1 A 0 1 A 3 2 A 2 3 2 B 3 2 B 1 2 B 3 2 -- David L Carlson Associate Professor of Anthropology Texas A&M University College Station, TX 77843-4352 > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- > project.org] On Behalf Of arun > Sent: Sunday, October 14, 2012 1:29 PM > To: siddu479 > Cc: R help > Subject: Re: [R] transforming a .csv file column names as per a > particular column rows using R code > > HI, > If you need the "Tool" column repeated, > Try this: > dat1<-read.table(text=" > Tool Step_Number Data1 Data2 Data3 > A 1 0 1 2 > A 2 3 1 4 > A 3 2 1 3 > B 1 3 2 1 > B 2 1 2 3 > B 3 3 2 0 > ",sep="",header=TRUE,stringsAsFactors=FALSE) > > dat3<-with(dat1,aggregate(dat1[,3:5],list(Tool=Tool),function(x) x)) > dat3 > #Tool Data1.1 Data1.2 Data1.3 Data2.1 Data2.2 Data2.3 Data3.1 Data3.2 > Data3.3 > #1 A 0 3 2 1 1 1 2 > 4 3 > #2 B 3 1 3 2 2 2 1 > 3 0 > dat4<-do.call(data.frame,lapply(lapply(dat3[,-1],function(x) > x),function(x) cbind(Tool=dat3[,1],x))) > colnames(dat4)<-gsub(".*\\.(Tool)","\\1",gsub("[V]","",colnames(dat4))) > dat4[grepl("Data.*",colnames(dat4))]<- > sapply(dat4[grepl("Data.*",colnames(dat4))],function(x)as.numeric(as.ch > aracter(x))) > dat4 > # Tool Data1.2 Data1.3 Data1.4 Tool Data2.2 Data2.3 Data2.4 Tool > Data3.2 > #1 A 0 3 2 A 1 1 1 A > 2 > #2 B 3 1 3 B 2 2 2 B > 1 > Data3.3 Data3.4 > 1 4 3 > 2 3 0 > str(dat4) > #'data.frame': 2 obs. of 12 variables: > # $ Tool : Factor w/ 2 levels "A","B": 1 2 > # $ Data1.2: num 0 3 > # $ Data1.3: num 3 1 > #$ Data1.4: num 2 3 > #$ Tool : Factor w/ 2 levels "A","B": 1 2 > #$ Data2.2: num 1 2 > #$ Data2.3: num 1 2 > #$ Data2.4: num 1 2 > # $ Tool : Factor w/ 2 levels "A","B": 1 2 > # $ Data3.2: num 2 1 > # $ Data3.3: num 4 3 > # $ Data3.4: num 3 0 > > A.K. > > > > > - Original Message - > From: siddu479 > To: r-help@r-project.org > Cc: > Sent: Sunday, October 14, 2012 12:13 PM > Subject: [R] transforming a .csv file column names as per a particular > column rows using R code > > Hello all, > I have a .csv file like below. > Tool,Step_Number,Data1,Data2... etc up to 100 columns. > A,1,0,1 > A,2,3,1 > A,3,2,1 > . > . > B,1,3,2 > B,2,1,2 > B,3,3,2 > . > . > .. so on upto 50 rows > where the column "*Tool*" has distinct steps in second column > "*Step_Number*",but both have same entries in Step_Number column. > I want the output like below. > > Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... > so > on > A,0,1,A,3,1,A,2,1 > B,3,2,B,1,2,B,3,2 > .. > so on. > > basically I am transposing entire data based on a specific column row > values > and renaming the column headers. > I have a shell script based on awk which can do this task, but the > script is > taking exceptionally higher processing time. > > So I am looking for a script in R which can save the time. > "Please revert to me if the problem description is not clear." > > Regards > Sidda > > > > > - > Sidda > Business Analyst Lead > Applied Materials Inc. > > -- > View this message in context: > http://r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as- > per-a-particular-column-rows-using-R-code-tp4646137.html > Sent from the R help mailing list archive at Nabble.com. > >
Re: [R] transforming a .csv file column names as per a particular column rows using R code
HI, Just a modification of the column names. I noticed that the "Data." column starts with "2". Try this: dat1<-read.table(text=" Tool Step_Number Data1 Data2 Data3 A 1 0 1 2 A 2 3 1 4 A 3 2 1 3 B 1 3 2 1 B 2 1 2 3 B 3 3 2 0 ",sep="",header=TRUE,stringsAsFactors=FALSE) dat3<-with(dat1,aggregate(dat1[,3:5],list(Tool=Tool),function(x) x)) dat3 dat4<-do.call(data.frame,lapply(lapply(dat3[,-1],function(x) x),function(x) cbind(Tool=dat3[,1],x))) colnames(dat4)<-gsub(".*\\.(Tool)","\\1",gsub("[V]","",colnames(dat4))) dat4[grepl("Data.*",colnames(dat4))]<-sapply(dat4[grepl("Data.*",colnames(dat4))],function(x)as.numeric(as.character(x))) colnames(dat4)[colnames(dat4)%in%colnames(dat4[grep("Data",colnames(dat4))])]<-paste(rep(names(lapply(lapply(dat3[,-1],function(x) x),function(x) cbind(Tool=dat3[,1],x))),each=length(unique(dat1$Step_Number))),"_",1:length(unique(dat1$Step_Number)),sep="") dat4 # Tool Data1_1 Data1_2 Data1_3 Tool Data2_1 Data2_2 Data2_3 Tool Data3_1 #1 A 0 3 2 A 1 1 1 A 2 #2 B 3 1 3 B 2 2 2 B 1 # Data3_2 Data3_3 #1 4 3 #2 3 0 A.K. - Original Message - From: siddu479 To: r-help@r-project.org Cc: Sent: Sunday, October 14, 2012 12:13 PM Subject: [R] transforming a .csv file column names as per a particular column rows using R code Hello all, I have a .csv file like below. Tool,Step_Number,Data1,Data2... etc up to 100 columns. A,1,0,1 A,2,3,1 A,3,2,1 . . B,1,3,2 B,2,1,2 B,3,3,2 . . .. so on upto 50 rows where the column "*Tool*" has distinct steps in second column "*Step_Number*",but both have same entries in Step_Number column. I want the output like below. Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so on A,0,1,A,3,1,A,2,1 B,3,2,B,1,2,B,3,2 .. so on. basically I am transposing entire data based on a specific column row values and renaming the column headers. I have a shell script based on awk which can do this task, but the script is taking exceptionally higher processing time. So I am looking for a script in R which can save the time. "Please revert to me if the problem description is not clear." Regards Sidda - Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: http://r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transforming a .csv file column names as per a particular column rows using R code
HI, If you need the "Tool" column repeated, Try this: dat1<-read.table(text=" Tool Step_Number Data1 Data2 Data3 A 1 0 1 2 A 2 3 1 4 A 3 2 1 3 B 1 3 2 1 B 2 1 2 3 B 3 3 2 0 ",sep="",header=TRUE,stringsAsFactors=FALSE) dat3<-with(dat1,aggregate(dat1[,3:5],list(Tool=Tool),function(x) x)) dat3 #Tool Data1.1 Data1.2 Data1.3 Data2.1 Data2.2 Data2.3 Data3.1 Data3.2 Data3.3 #1 A 0 3 2 1 1 1 2 4 3 #2 B 3 1 3 2 2 2 1 3 0 dat4<-do.call(data.frame,lapply(lapply(dat3[,-1],function(x) x),function(x) cbind(Tool=dat3[,1],x))) colnames(dat4)<-gsub(".*\\.(Tool)","\\1",gsub("[V]","",colnames(dat4))) dat4[grepl("Data.*",colnames(dat4))]<-sapply(dat4[grepl("Data.*",colnames(dat4))],function(x)as.numeric(as.character(x))) dat4 # Tool Data1.2 Data1.3 Data1.4 Tool Data2.2 Data2.3 Data2.4 Tool Data3.2 #1 A 0 3 2 A 1 1 1 A 2 #2 B 3 1 3 B 2 2 2 B 1 Data3.3 Data3.4 1 4 3 2 3 0 str(dat4) #'data.frame': 2 obs. of 12 variables: # $ Tool : Factor w/ 2 levels "A","B": 1 2 # $ Data1.2: num 0 3 # $ Data1.3: num 3 1 #$ Data1.4: num 2 3 #$ Tool : Factor w/ 2 levels "A","B": 1 2 #$ Data2.2: num 1 2 #$ Data2.3: num 1 2 #$ Data2.4: num 1 2 # $ Tool : Factor w/ 2 levels "A","B": 1 2 # $ Data3.2: num 2 1 # $ Data3.3: num 4 3 # $ Data3.4: num 3 0 A.K. - Original Message - From: siddu479 To: r-help@r-project.org Cc: Sent: Sunday, October 14, 2012 12:13 PM Subject: [R] transforming a .csv file column names as per a particular column rows using R code Hello all, I have a .csv file like below. Tool,Step_Number,Data1,Data2... etc up to 100 columns. A,1,0,1 A,2,3,1 A,3,2,1 . . B,1,3,2 B,2,1,2 B,3,3,2 . . .. so on upto 50 rows where the column "*Tool*" has distinct steps in second column "*Step_Number*",but both have same entries in Step_Number column. I want the output like below. Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so on A,0,1,A,3,1,A,2,1 B,3,2,B,1,2,B,3,2 .. so on. basically I am transposing entire data based on a specific column row values and renaming the column headers. I have a shell script based on awk which can do this task, but the script is taking exceptionally higher processing time. So I am looking for a script in R which can save the time. "Please revert to me if the problem description is not clear." Regards Sidda - Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: http://r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transforming a .csv file column names as per a particular column rows using R code
Hello, Try the following. dta1 <- read.csv( text= "Tool,Step_Number,Data1,Data2 A,1,0,1 A,2,3,2 A,3,2,3 B,1,3,2 B,2,1,2 B,3,3,2 ") sp <- split(dta1[-2], dta1$Tool) dta2 <- do.call(rbind, lapply(sp, function(x) as.vector(unlist(t(x) dta2 <- data.frame(dta2, stringsAsFactors = FALSE) idx <- rep(c(FALSE, TRUE, TRUE), ncol(dta2) %/% 3) dta2[, idx] <- sapply(dta2[, idx], as.integer) Hope this helps, Rui Barradas Em 14-10-2012 17:13, siddu479 escreveu: Hello all, I have a .csv file like below. Tool,Step_Number,Data1,Data2... etc up to 100 columns. A,1,0,1 A,2,3,1 A,3,2,1 . . B,1,3,2 B,2,1,2 B,3,3,2 . . .. so on upto 50 rows where the column "*Tool*" has distinct steps in second column "*Step_Number*",but both have same entries in Step_Number column. I want the output like below. Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so on A,0,1,A,3,1,A,2,1 B,3,2,B,1,2,B,3,2 .. so on. basically I am transposing entire data based on a specific column row values and renaming the column headers. I have a shell script based on awk which can do this task, but the script is taking exceptionally higher processing time. So I am looking for a script in R which can save the time. "Please revert to me if the problem description is not clear." Regards Sidda - Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: http://r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transforming a .csv file column names as per a particular column rows using R code
Hi, If you don't want the column "Tool" to be repeated, try this: dat1<-read.table(text=" Tool Step_Number Data1 Data2 A 1 0 1 A 2 3 1 A 3 2 1 B 1 3 2 B 2 1 2 B 3 3 2 ",sep="",header=TRUE,stringsAsFactors=FALSE) with(dat1,aggregate(cbind(Data1,Data2),list(Tool=Tool),function(x) x)) Tool Data1.1 Data1.2 Data1.3 Data2.1 Data2.2 Data2.3 #1 A 0 3 2 1 1 1 #2 B 3 1 3 2 2 2A.K. - Original Message - From: siddu479 To: r-help@r-project.org Cc: Sent: Sunday, October 14, 2012 12:13 PM Subject: [R] transforming a .csv file column names as per a particular column rows using R code Hello all, I have a .csv file like below. Tool,Step_Number,Data1,Data2... etc up to 100 columns. A,1,0,1 A,2,3,1 A,3,2,1 . . B,1,3,2 B,2,1,2 B,3,3,2 . . .. so on upto 50 rows where the column "*Tool*" has distinct steps in second column "*Step_Number*",but both have same entries in Step_Number column. I want the output like below. Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so on A,0,1,A,3,1,A,2,1 B,3,2,B,1,2,B,3,2 .. so on. basically I am transposing entire data based on a specific column row values and renaming the column headers. I have a shell script based on awk which can do this task, but the script is taking exceptionally higher processing time. So I am looking for a script in R which can save the time. "Please revert to me if the problem description is not clear." Regards Sidda - Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: http://r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transforming a .csv file column names as per a particular column rows using R code
Not clear to me what your successive rows indicate in your output. However, you may get something similar to what you want using the reshape2 library: library(reshape2) dta1 <- read.csv( text= "Tool,Step_Number,Data1,Data2 A,1,0,1 A,2,3,2 A,3,2,3 B,1,3,2 B,2,1,2 B,3,3,2 ") # put into long form dta1l <- melt(dta1,c("Tool","Step_Number")) # re-cast to wide form # assuming first row is for tool A, second row is for tool B, etc. dta2 <- dcast(dta1l,Tool~variable+Step_Number) On Sun, 14 Oct 2012, siddu479 wrote: Hello all, I have a .csv file like below. Tool,Step_Number,Data1,Data2... etc up to 100 columns. A,1,0,1 A,2,3,1 A,3,2,1 . . B,1,3,2 B,2,1,2 B,3,3,2 . . .. so on upto 50 rows where the column "*Tool*" has distinct steps in second column "*Step_Number*",but both have same entries in Step_Number column. I want the output like below. Tool_1,Data1_1,Data2_1,Tool_2,Data1_2,Data2_2,Tool_3,Data1_3,Data2_3... so on A,0,1,A,3,1,A,2,1 B,3,2,B,1,2,B,3,2 .. so on. basically I am transposing entire data based on a specific column row values and renaming the column headers. I have a shell script based on awk which can do this task, but the script is taking exceptionally higher processing time. So I am looking for a script in R which can save the time. "Please revert to me if the problem description is not clear." Regards Sidda - Sidda Business Analyst Lead Applied Materials Inc. -- View this message in context: http://r.789695.n4.nabble.com/transforming-a-csv-file-column-names-as-per-a-particular-column-rows-using-R-code-tp4646137.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. --- Jeff NewmillerThe . . Go Live... DCN:Basics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.